PCT

WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification:
C12N 15/12, C07K 14/705, 16/28, G01N 33/566, C07K 14/435, C12Q 1/68, C12N 5/10

A2

(11) International Publication Number: WO 97/45541
(43) International Publication Date: 4 December 1997 (04.12.97)

(21) International Application Number: PCT/US97/09553

(22) International Filing Date: 2 June 1997 (02.06.97)

(30) Priority Data:
08/656,055    31 May 1996 (31.05.96)    US

(71) Applicants: THE LELAND S. STANFORD JUNIOR UNIVERSITY [US/US]; Office of Technology
Licensing, Suite 350, 900 Welch Road, Palo Alto, CA 94304 (US). THE REGENTS OF THE
UNIVERSITY OF CALIFORNIA [US/US]; 22nd floor, 300 Lakeside Drive, Oakland, CA 94612 (US).

(72) Inventors: SCOTT, Matthew, P.; 914 Wing Place, Stanford, CA 94035 (US). GOODRICH,
Lisa, V.; 66 Newell Road, Palo Alto, CA 94303 (US). JOHNSON, Ronald, L.; Apartment 7,
1528 Hudson Street, Redwood City, CA 94061 (US). EPSTEIN, Ervin, Jr.; 553A Miner Road,
Orinda, CA 94563 (US). ORO, Anthony; 1120 Welch Road #216, Palo Alto, CA 94304 (US).

(74) Agents: ARNOLD, Beth, E. et al.; Foley, Hoag & Eliot LLP, One Post Office Square,
Boston, MA 02109 (US).

(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, CA, CH, CN, CU, CZ, DE,
DK, EE, ES, FI, GB, GE, HU, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV,
MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, TJ, TM, TR, TT, UA,
UG, UZ, VN, ARIPO patent (GH, KE, LS, MW, SD, SZ, UG), Eurasian patent (AM, AZ, BY, KG,
KZ, MD, RU, TJ, TM), European patent (AT, BE, CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU,
MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE, SN, TD, TG).

Published
Without international search report and to be republished upon receipt of that report.



(54) Title: PATCHED GENES AND THEIR USES 
(57) Abstract 



Methods for isolating patched genes, particularly mammalian patched genes, including the mouse and human patched genes, as well
as invertebrate patched genes and sequences, are provided. Decreased expression of patched is associated with the occurrence of human
cancers, particularly basal cell carcinomas of the skin. The cancers may be familial, having as a component of risk an inherited genetic
predisposition, or may be sporadic. The patched and hedgehog genes are useful in creating transgenic animal models for these human
cancers. The patched nucleic acid compositions find use in identifying homologous or related proteins and the DNA sequences encoding
such proteins; in producing compositions that modulate the expression or function of the protein; and in studying associated physiological
pathways. In addition, modulation of the gene activity in vivo is used for prophylactic and therapeutic purposes, such as treatment of cancer,
identification of cell type based on expression, and the like. The DNA is further used as a diagnostic for a genetic predisposition to cancer,
and to identify specific cancers having mutations in this gene.



FOR THE PURPOSES OF INFORMATION ONLY

Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.

AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe




PATCHED GENES AND THEIR USES

This invention was made with support from the Howard Hughes Medical Institute. The 
Government may have certain rights in this invention. 

INTRODUCTION
Technical Field 

The field of this invention is segment polarity genes and their uses. 
Background 

Segment polarity genes were originally discovered as mutations in flies that change the
pattern of body segment structures. Mutations in these genes cause animals to develop changed
patterns on the surfaces of body segments, the changes affecting the pattern along the head to
tail axis. Among the genes in this class are hedgehog, which encodes a secreted protein (HH),
and patched, which encodes a protein structurally similar to transporter proteins, having twelve
transmembrane domains (ptc), with two conserved glycosylation signals.

The hedgehog gene of flies has at least three vertebrate relatives: Sonic hedgehog (Shh),
Indian hedgehog (Ihh), and Desert hedgehog (Dhh). Shh is expressed in a group of cells, at
the posterior of each developing limb bud, that have an important role in signaling polarity to
the developing limb. The Shh protein product, SHH, is a critical trigger of posterior limb
development, and is also involved in polarizing the neural tube and somites along the dorsal
ventral axis. Based on genetic experiments in flies, patched and hedgehog have antagonistic
effects in development. The patched gene product, ptc, is widely expressed in fetal and adult
tissues, and plays an important role in regulation of development. Ptc downregulates
transcription of itself, members of the transforming growth factor β and Wnt gene families, and
possibly other genes. Among other activities, HH upregulates expression of patched and other
genes that are negatively regulated by patched.

It is of interest that many genes involved in the regulation of growth and control of
cellular signaling are also involved in oncogenesis. Such genes may be oncogenes, which are
typically upregulated in tumor cells, or tumor suppressor genes, which are down-regulated or
absent in tumor cells. Malignancies may arise when a tumor suppressor is lost and/or an
oncogene is inappropriately activated. Familial predisposition to cancer may occur when there
is a mutation, such as loss of an allele encoding a suppressor gene, present in the germline DNA
of an individual.

The most common form of cancer in the United States is basal cell carcinoma of the skin.
While sporadic cases are very common, there are also familial syndromes, such as the basal cell
nevus syndrome (BCNS). The familial syndrome has many features indicative of abnormal
embryonic development, indicating that the mutated gene also plays an important role in
development of the embryo. A loss of heterozygosity of chromosome 9q alleles in both familial
and sporadic carcinomas suggests that a tumor suppressor gene is present in this region. The
high incidence of skin cancer makes the identification of this putative tumor suppressor gene of
great interest for diagnosis, therapy, and drug screening.
Relevant Literature

Descriptions of patched, by itself or its role with hedgehog, may be found in Hooper and
Scott (1989) Cell 59:751-765; and Nakano et al. (1989) Nature 341:508-513. Both of these
references also describe the sequence for Drosophila patched. Discussions of the role of
hedgehog include Riddle et al. (1993) Cell 75:1401-1416; Echelard et al. (1993) Cell 75:1417-
1430; Krauss et al. (1993) Cell 75:1431-1444; Tabata and Kornberg (1994) Cell 76:89-102;
Heemskerk and DiNardo (1994) Cell 76:449-460; and Roelink et al. (1994) Cell 76:761-775.

Mapping of deleted regions on chromosome 9 in skin cancers is described in Habuchi
et al. (1995) Oncogene 11:1671-1674; Quinn et al. (1994) Genes Chromosome Cancer
11:222-225; Quinn et al. (1994) J. Invest. Dermatol. 102:300-303; and Wicking et al. (1994)
Genomics 22:505-511.

Gorlin (1987) Medicine 66:98-113 reviews nevoid basal cell carcinoma syndrome. The
syndrome shows autosomal dominant inheritance with probably complete penetrance. About
60% of the cases represent new mutations. Developmental abnormalities found with this
syndrome include rib and craniofacial abnormalities, polydactyly, syndactyly and spina bifida.
Tumors found with the syndrome include basal cell carcinomas, fibromas of the ovaries and
heart, cysts of the skin, jaws and mesentery, meningiomas and medulloblastomas.

SUMMARY OF THE INVENTION 
Isolated nucleotide compositions and sequences are provided for patched (ptc) genes,
including mammalian, e.g. human and mouse, and invertebrate homologs. Decreased
expression of ptc is associated with the occurrence of human cancers, particularly basal
cell carcinomas and other tumors of epithelial tissues such as the skin. The cancers may be
familial, having as a component of risk a germline mutation in the gene, or may be sporadic.
Ptc, and its antagonist hedgehog, are useful in creating transgenic animal models for these
human cancers. The ptc nucleic acid compositions find use in identifying homologous or
related genes; in producing compositions that modulate the expression or function of its encoded
protein, ptc; for gene therapy, mapping functional regions of the protein; and in studying
associated physiological pathways. In addition, modulation of the gene activity in vivo is used
for prophylactic and therapeutic purposes, such as treatment of cancer, identification of cell type
based on expression, and the like. Ptc, anti-ptc antibodies and ptc nucleic acid sequences are
useful as diagnostics for a genetic predisposition to cancer or developmental abnormality
syndromes, and to identify specific cancers having mutations in this gene.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a graph having a restriction map of about 10 kbp of the 5' region upstream from
the initiation codon of the Drosophila patched gene and bar graphs of constructs of truncated
portions of the 5' region joined to β-galactosidase, where the constructs are introduced into fly
cell lines for the production of embryos. The expression of β-gal in the embryos is indicated
in the right-hand table during early and late development of the embryo. The greater the
number of +'s, the more intense the staining.

Fig. 2 shows a summary of mutations found in the human patched gene locus that are
associated with basal cell nevus syndrome. Mutation (1) is found in sporadic basal cell
carcinoma, and is a C to T transition in exon 3 at nucleotide 523 of the coding sequence,
changing Leu 175 to Phe in the first extracellular loop. Mutations 2-4 are found in hereditary
basal cell nevus syndrome. (2) is an insertion of 9 bp at nucleotide 2445, resulting in the
insertion of an additional 3 amino acids after amino acid 815. (3) is a deletion of 11 bp, which
removes nt 2442-2452 from the coding sequence. The resulting frameshift truncates the open
reading frame after amino acid 813, just after the seventh transmembrane domain. (4) is a G to
C alteration that changes two conserved nucleotides of the 3' splice site adjacent to exon 10,
creating a non-functional splice site that truncates the protein after amino acid 449, in the second
transmembrane region.

DATABASE REFERENCES FOR NUCLEOTIDE AND AMINO ACID SEQUENCES
The sequence for the D. melanogaster patched gene has the Genbank accession
number M28418. The sequence for the mouse patched gene has the Genbank accession
number It30589-V46155. The sequence for the human patched gene has the Genbank
accession number U59464.
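
The coordinates given for the Fig. 2 mutations can be cross-checked with simple codon arithmetic. The following is a minimal sketch, assuming that coding nucleotides are numbered from 1 at the first base of the initiation codon; the function names are illustrative and are not taken from the application.

```python
def codon_number(nt: int) -> int:
    """Return the 1-based codon that contains 1-based coding nucleotide `nt`."""
    return (nt - 1) // 3 + 1


def codon_position(nt: int) -> int:
    """Return 1, 2 or 3: the position of `nt` within its codon."""
    return (nt - 1) % 3 + 1


def is_frameshift(length_change_bp: int) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return length_change_bp % 3 != 0


def intact_codons_upstream(first_affected_nt: int) -> int:
    """Count the codons that lie entirely upstream of the first affected nucleotide."""
    return (first_affected_nt - 1) // 3


if __name__ == "__main__":
    # Mutation (1): substitution at nt 523 falls in codon 175, first position.
    print(codon_number(523), codon_position(523))       # 175 1
    # Mutation (2): 9 bp inserted at nt 2445 (last base of codon 815) stays in frame
    # and adds 9 // 3 = 3 amino acids.
    print(codon_number(2445), is_frameshift(9), 9 // 3)  # 815 False 3
    # Mutation (3): deleting nt 2442-2452 removes 11 bp, which shifts the frame;
    # 813 codons upstream of nt 2442 remain intact.
    deleted = 2452 - 2442 + 1
    print(deleted, is_frameshift(deleted), intact_codons_upstream(2442))  # 11 True 813
```

Under this numbering assumption the arithmetic agrees with the amino acid positions stated for mutations (1) to (3).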


DESCRIPTION OF THE SPECIFIC EMBODIMENTS 
Mammalian and invertebrate patched (ptc) gene compositions and methods for their
isolation are provided. Of particular interest are the human and mouse homologs. Certain
human cancers, e.g. basal cell carcinoma, transitional cell carcinoma of the bladder,
meningiomas, medulloblastomas, etc., show decreased ptc activity, resulting from oncogenic
mutations at the ptc locus. Many such cancers are sporadic, where the tumor cells have a
somatic mutation in ptc. The basal cell nevus syndrome (BCNS), an inherited disorder, is
associated with germline mutations in ptc. Such germline mutations may also be associated
with other human cancers, including carcinomas, adenocarcinomas, sarcomas and the like.
Decreased ptc activity is also associated with inherited developmental abnormalities, e.g. rib and
craniofacial abnormalities, polydactyly, syndactyly and spina bifida.

The ptc genes and fragments thereof, encoded protein, and anti-ptc antibodies are useful
in the identification of individuals predisposed to development of such cancers and
developmental abnormalities, and in characterizing the phenotype of sporadic tumors that are
associated with this gene, e.g., for diagnostic and/or prognostic benefit. The characterization
is useful for prenatal screening, and in determining further treatment of the patient. Tumors
may be typed or staged as to the ptc status, e.g. by detection of mutated sequences, antibody
detection of abnormal protein products, and functional assays for altered ptc activity. The
encoded ptc protein is useful in drug screening for compositions that mimic ptc activity or
expression, including altered forms of ptc protein, particularly with respect to ptc function as
a tumor suppressor in oncogenesis.

The human and mouse ptc gene sequences and isolated nucleic acid compositions are
provided. In identifying the mouse and human patched genes, cross-hybridization of DNA and
amplification primers were employed to move through the evolutionary tree from the known
Drosophila ptc sequence, identifying a number of invertebrate homologs. The human patched
gene has been mapped to human chromosome band 9q22.3, and lies between the polymorphic
markers D9S196 and D9S287 (a detailed map of human genome markers may be found in Dib
et al. (1996) Nature 380:152-154; http://www.genethon.fr).

DNA from a patient having a tumor or developmental abnormality, which may be
associated with ptc, is analyzed for the presence of a predisposing mutation in the ptc gene.
The presence of a mutated ptc sequence that affects the activity or expression of the gene
product, ptc, confers an increased susceptibility to one or more of these conditions. Individuals
are screened by analyzing their DNA for the presence of a predisposing oncogenic or
developmental mutation, as compared to a normal sequence. A "normal" sequence of patched
is provided in SEQ ID NO:18 (human). Specific mutations of interest include any mutation
that leads to oncogenesis or developmental abnormalities, including insertions, substitutions and
deletions in the coding region sequence, in introns that affect splicing, and in promoter or
enhancer sequences that affect the activity and expression of the protein.

Screening for tumors or developmental abnormalities may also be based on the
functional or antigenic characteristics of the protein. Immunoassays designed to detect the
normal or abnormal ptc protein may be used in screening. Where many diverse mutations lead
to a particular disease phenotype, functional protein assays have proven to be effective screening
tools. Such assays may be based on detecting changes in the transcriptional regulation
mediated by ptc, or may directly detect ptc transporter activity, or may involve antibody
localization of patched in cells.

Inheritance of BCNS is autosomal dominant, although many cases are the result of new
mutations. Diagnosis of BCNS is performed by protein, DNA sequence or hybridization
analysis of any convenient sample from a patient, e.g. biopsy material, blood sample, scrapings
from cheek, etc. A typical patient genotype will have a predisposing mutation on one
chromosome. In tumors and at least sometimes developmentally affected tissues, loss of
heterozygosity at the ptc locus leads to aberrant cell and tissue behavior. When the normal
copy of ptc is lost, leaving only the reduced function mutant copy, abnormal cell growth and
reduced cell layer adhesion are the result. Examples of specific ptc mutations in BCNS patients
are a 9 bp insertion at nt 2445 of the coding sequence, and an 11 bp deletion of nt 2441 to 2452
of the coding sequence. These result in insertions or deletions in the region of the seventh
transmembrane domain.

Prenatal diagnosis of BCNS may be performed, particularly where there is a family
history of the disease, e.g. an affected parent or sibling. It is desirable, although not required,
in such cases to determine the specific predisposing mutation present in affected family
members. A sample of fetal DNA, such as an amniocentesis sample, fetal nucleated or white
blood cells isolated from maternal blood, chorionic villus sample, etc. is analyzed for the
presence of the predisposing mutation. Alternatively, a protein based assay, e.g. functional
assay or immunoassay, is performed on fetal cells known to express ptc.

Sporadic tumors associated with loss of ptc function include a number of carcinomas and
other transformed cells known to have deletions in the region of chromosome 9q22, e.g. basal
cell carcinomas, transitional bladder cell carcinoma, meningiomas, medullomas, fibromas of the
heart and ovary, and carcinomas of the lung, ovary, kidney and esophagus. Characterization
of sporadic tumors will generally require analysis of tumor cell DNA, conveniently with a biopsy
sample. A wide range of mutations are found in sporadic cases, up to and including deletion
of the entire long arm of chromosome 9. Oncogenic mutations may delete one or more exons,
e.g. 8 and 9, may affect the amino acid sequence such as of the extracellular loops or
transmembrane domains, may cause truncation of the protein by introducing a frameshift or stop
codon, etc. Specific examples of oncogenic mutations include a C to T transition at nt 523,
and deletions encompassing exon 9. C to T transitions are characteristic of ultraviolet
mutagenesis, as expected with cases of skin cancer.

Biochemical studies may be performed to determine whether a candidate sequence
variation in the ptc coding region or control regions is oncogenic. For example, a change in the
promoter or enhancer sequence that downregulates expression of patched may result in
predisposition to cancer. Expression levels of a candidate variant allele are compared to
expression levels of the normal allele by various methods known in the art. Methods for
determining promoter or enhancer strength include quantitation of the expressed natural protein;
insertion of the variant control element into a vector with a reporter gene such as β-
galactosidase, chloramphenicol acetyltransferase, etc. that provides for convenient quantitation;
and the like. The activity of the encoded ptc protein may be determined by comparison with
the wild-type protein, e.g. by detection of transcriptional down-regulation of TGFβ, Wnt family
genes, itself, or reporter gene fusions involving these target genes.

The human patched gene (SEQ ID NO:18) has a 4.5 kb open reading frame encoding
a protein of 1447 amino acids. Including coding and noncoding sequences, it is about 89%
identical at the nucleotide level to the mouse patched gene (SEQ ID NO:09). The mouse
patched gene (SEQ ID NO:09) encodes a protein (SEQ ID NO:10) that has about 38% identical
amino acids to Drosophila ptc (SEQ ID NO:6), over about 1,200 amino acids. The butterfly
homolog (SEQ ID NO:4) is 1,300 amino acids long and overall has a 50% amino acid identity
to fly ptc (SEQ ID NO:6). A 267 bp exon from the beetle patched gene encodes an 89 amino
acid protein fragment, which was found to be 44% and 51% identical to the corresponding
regions of fly and butterfly ptc respectively.
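
Identity figures such as those quoted above are computed from an alignment of the two sequences. The sketch below shows one common convention, assumed here for illustration: identical residues divided by the number of aligned columns in which neither sequence has a gap. The application does not state which convention or alignment program was used, and the short peptide strings in the usage lines are invented, not patched sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 6 ungapped columns, 4 of them identical.
    print(round(percent_identity("MKT-LLV", "MRTALIV"), 1))
    # A 267 bp exon holds a whole number of codons: 267 / 3 = 89 amino acids.
    print(267 % 3 == 0, 267 // 3)
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so identity values are only comparable when the denominator is stated.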

The DNA sequence encoding ptc may be cDNA or genomic DNA or a fragment thereof.
The term "patched gene" shall be intended to mean the open reading frame encoding specific
ptc polypeptides, as well as adjacent 5' and 3' non-coding nucleotide sequences involved in the
regulation of expression, up to about 1 kb beyond the coding region, in either direction. The
gene may be introduced into an appropriate vector for extrachromosomal maintenance or for
integration into the host.

The term "cDNA" as used herein is intended to include all nucleic acids that share the
arrangement of sequence elements found in native mature mRNA species, where sequence
elements are exons, 3' and 5' non-coding regions. Normally mRNA species have contiguous
exons, with the intervening introns deleted, to create a continuous open reading frame encoding
ptc.

The genomic ptc sequence has non-contiguous open reading frames, where introns
interrupt the coding regions. A genomic sequence of interest comprises the nucleic acid present
between the initiation codon and the stop codon, as defined in the listed sequences, including
all of the introns that are normally present in a native chromosome. It may further include the
3' and 5' untranslated regions found in the mature mRNA. It may further include specific
transcriptional and translational regulatory sequences, such as promoters, enhancers, etc.,
including about 1 kb of flanking genomic DNA at either the 5' or 3' end of the coding region.
The genomic DNA may be isolated as a fragment of 50 kbp or smaller, and substantially free
of flanking chromosomal sequence.

The nucleic acid compositions of the subject invention encode all or a part of the subject
polypeptides. Fragments may be obtained of the DNA sequence by chemically synthesizing
oligonucleotides in accordance with conventional methods, by restriction enzyme digestion, by
PCR amplification, etc. For the most part, DNA fragments will be of at least 15 nt, usually at
least 18 nt, more usually at least about 50 nt. Such small DNA fragments are useful as primers
for PCR, hybridization screening, etc. Larger DNA fragments, i.e. greater than 100 nt, are
useful for production of the encoded polypeptide. For use in amplification reactions, such as
PCR, a pair of primers will be used. The exact composition of the primer sequences is not
critical to the invention, but for most applications the primers will hybridize to the subject
sequence under stringent conditions, as known in the art. It is preferable to choose a pair of
primers that will generate an amplification product of at least about 50 nt, preferably at least
about 100 nt. Algorithms for the selection of primer sequences are generally known, and are
available in commercial software packages. Amplification primers hybridize to complementary
strands of DNA, and will prime towards each other.
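
The primer-pair criteria in the preceding paragraph (minimum primer length, primers on complementary strands priming towards each other, minimum product size) can be illustrated with a short check. This is a simplified sketch, not a primer design algorithm: it assumes exact sequence matching as a stand-in for hybridization under stringent conditions, the function names and default thresholds are illustrative choices drawn from the figures in the text, and the sequences in the usage lines are arbitrary, not patched sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the product spanned by the primer pair, or None if no product.

    `forward` must match the given (top) strand; `reverse`, written 5'->3',
    must match the complementary strand downstream of the forward primer,
    so that the two primers prime towards each other.
    """
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)  # as it appears on the top strand
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return end + len(reverse_site) - start


def primer_pair_ok(template, forward, reverse, min_primer=18, min_product=50):
    """Apply the length thresholds to both primers and to the product."""
    if len(forward) < min_primer or len(reverse) < min_primer:
        return False
    length = amplicon_length(template, forward, reverse)
    return length is not None and length >= min_product


if __name__ == "__main__":
    fwd = "ATGGCCTCGGCTGGTAAC"                 # arbitrary 18-mer
    site = "GGATCCTTAGCAGTCCAA"                # arbitrary downstream 18-mer
    template = "TTT" + fwd + "ACGT" * 10 + site + "AAA"
    rev = reverse_complement(site)
    print(amplicon_length(template, fwd, rev))  # 76
    print(primer_pair_ok(template, fwd, rev))   # True
```

Real primer selection also weighs melting temperature, self-complementarity and mismatch tolerance, which this check deliberately leaves out.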

The ptc genes are isolated and obtained in substantial purity, generally as other than an
intact mammalian chromosome. Usually, the DNA will be obtained substantially free of other
nucleic acid sequences that do not include a ptc sequence or fragment thereof, generally being
at least about 50%, usually at least about 90% pure and are typically "recombinant", i.e. flanked
by one or more nucleotides with which it is not normally associated on a naturally occurring
chromosome.

The DNA sequences are used in a variety of ways. They may be used as probes for
identifying other patched genes. Mammalian homologs have substantial sequence similarity to
the subject sequences, i.e. at least 75%, usually at least 90%, more usually at least 95%
sequence identity with the nucleotide sequence of the subject DNA sequence. Sequence
similarity is calculated based on a reference sequence, which may be a subset of a larger
sequence, such as a conserved motif, coding region, flanking region, etc. A reference sequence
will usually be at least about 18 nt long, more usually at least about 30 nt long, and may extend
to the complete sequence that is being compared. Algorithms for sequence analysis are known
in the art, such as BLAST, described in Altschul et al. (1990) J Mol Biol 215:403-10.

Nucleic acids having sequence similarity are detected by hybridization under low
stringency conditions, for example, at 50°C and 10XSSC (0.9 M saline/0.09 M sodium citrate)
and remain bound when subjected to washing at 55°C in 1XSSC. By using probes, particularly
labeled probes of DNA sequences, one can isolate homologous or related genes. The source of
homologous genes may be any mammalian species, e.g. primate species, particularly human;
murines, such as rats and mice, canines, felines, bovines, ovines, equines, etc.
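
The salt concentrations of hybridization and wash buffers follow from the fold concentration of SSC. The sketch below assumes the usual laboratory definition of 1X SSC as 0.15 M sodium chloride and 0.015 M sodium citrate; the application itself does not define 1X SSC, so this definition and the function names are assumptions made for illustration.

```python
# Assumed composition of 1X SSC (molar); not stated in the text itself.
SSC_1X = {"NaCl_M": 0.15, "sodium_citrate_M": 0.015}


def ssc_molarity(fold: float) -> dict:
    """Molar concentrations of each component at the given fold concentration."""
    return {name: round(molar * fold, 4) for name, molar in SSC_1X.items()}


def fold_for_nacl(nacl_molar: float) -> float:
    """Fold concentration of SSC that gives the stated NaCl molarity."""
    return round(nacl_molar / SSC_1X["NaCl_M"], 4)


if __name__ == "__main__":
    print(ssc_molarity(1))     # wash buffer, 1X
    print(ssc_molarity(10))    # 10X under the assumed definition
    print(fold_for_nacl(0.9))  # fold concentration giving 0.9 M saline
```

Under that assumed definition, 0.9 M saline with 0.09 M sodium citrate corresponds to 6X rather than 10X, so the fold value and the parenthetical molarities are worth checking against the buffer actually prepared.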

The DNA may also be used to identify expression of the gene in a biological specimen.
The manner in which one probes cells for the presence of particular nucleotide sequences, as
genomic DNA or RNA, is well-established in the literature and does not require elaboration
here. Conveniently, a biological specimen is used as a source of mRNA. The mRNA may be
amplified by RT-PCR, using reverse transcriptase to form a complementary DNA strand,
followed by polymerase chain reaction amplification using primers specific for the subject DNA
sequences. Alternatively, the mRNA sample is separated by gel electrophoresis, transferred to
a suitable support, e.g. nitrocellulose, and then probed with a fragment of the subject DNA as
a probe. Other techniques may also find use. Detection of mRNA having the subject sequence
is indicative of patched gene expression in the sample.

The subject nucleic acid sequences may be modified for a number of purposes,
particularly where they will be used intracellularly, for example, by being joined to a nucleic acid
cleaving agent, e.g. a chelated metal ion, such as iron or chromium for cleavage of the gene; as
an antisense sequence; or the like. Modifications may include replacing oxygen of the
phosphate esters with sulfur or nitrogen, replacing the phosphate with phosphoramide, etc.

A number of methods are available for analyzing genomic DNA sequences. Where large
amounts of DNA are available, the genomic DNA is used directly. Alternatively, the region of
interest is cloned into a suitable vector and grown in sufficient quantity for analysis, or amplified
by conventional techniques, such as the polymerase chain reaction (PCR). The use of the
polymerase chain reaction is described in Saiki, et al. (1985) Science 239:487, and a review
of current techniques may be found in Sambrook, et al. Molecular Cloning: A Laboratory
Manual, CSH Press 1989, pp. 14.2-14.33.

A detectable label may be included in the amplification reaction. Suitable labels include
fluorochromes, e.g. fluorescein isothiocyanate (FITC), rhodamine, Texas Red, phycoerythrin,
allophycocyanin, 6-carboxyfluorescein (6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6-
carboxyfluorescein (JOE), 6-carboxy-X-rhodamine (ROX), 6-carboxy-2',4',7',4,7-
hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or N,N,N',N'-tetramethyl-6-
carboxyrhodamine (TAMRA); radioactive labels, e.g. 32P, 35S, 3H; etc. The label may be a two
stage system, where the amplified DNA is conjugated to biotin, haptens, etc. having a high
affinity binding partner, e.g. avidin, specific antibodies, etc., where the binding partner is
conjugated to a detectable label. The label may be conjugated to one or both of the primers.
Alternatively, the pool of nucleotides used in the amplification is labeled, so as to incorporate
the label into the amplification product.

The amplified or cloned fragment may be sequenced by dideoxy or other methods, and
the sequence of bases compared to the normal ptc sequence. Hybridization with the variant
sequence may also be used to determine its presence, by Southern blots, dot blots, etc. Single
strand conformational polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis
(DGGE), and heteroduplex analysis in gel matrices are used to detect conformational changes
created by DNA sequence variation as alterations in electrophoretic mobility. The hybridization
pattern of a control and variant sequence to an array of oligonucleotide probes immobilized on
a solid support, as described in WO 95/11995, may also be used as a means of detecting the
presence of variant sequences. Alternatively, where a predisposing mutation creates or destroys
a recognition site for a restriction endonuclease, the fragment is digested with that endonuclease,
and the products size fractionated to determine whether the fragment was digested.
Fractionation is performed by gel electrophoresis, particularly acrylamide or agarose gels.
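
The restriction-site test described above can be sketched as an in-silico digest that compares the fragment sizes expected from a normal and a variant sequence. The example is illustrative only: the fragments are invented, the function name is hypothetical, and EcoRI (recognition site GAATTC, cutting after the first G) is used simply as a familiar enzyme, not one named in the application. Because the site is palindromic, searching one strand is sufficient.

```python
def digest(fragment: str, site: str, cut_offset: int) -> list:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the number of bases into the recognition site at which
    the top strand is cut (1 for EcoRI, G^AATTC).
    """
    cuts = []
    pos = fragment.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = fragment.find(site, pos + 1)
    bounds = [0] + cuts + [len(fragment)]
    # Drop zero-length pieces that would arise from a cut at either end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    normal = "A" * 20 + "GAATTC" + "C" * 30   # invented fragment with one site
    variant = "A" * 20 + "GAGTTC" + "C" * 30  # single-base change destroys the site
    print(digest(normal, "GAATTC", 1))   # [21, 35]
    print(digest(variant, "GAATTC", 1))  # [56]
```

On a gel, the normal fragment would resolve as two bands and the variant as a single uncut band, which is the size difference the fractionation step is meant to reveal.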

The subject nucleic acids can be used to generate transgenic animals or site specific gene
modifications in cell lines. Transgenic animals may be made through homologous recombination,
where the normal patched locus is altered. Alternatively, a nucleic acid construct is randomly
integrated into the genome. Vectors for stable integration include plasmids, retroviruses and
other animal viruses, YACs, and the like.

The modified cells or animals are useful in the study of patched function and regulation.
For example, a series of small deletions and/or substitutions may be made in the patched gene
to determine the role of different exons in oncogenesis, signal transduction, etc. Of particular
interest are transgenic animal models for carcinomas of the skin, where expression of ptc is
specifically reduced or absent in skin cells. Alternative transgenic models for this
disease are those where one of the mammalian hedgehog genes, e.g. Shh, Ihh, Dhh, is
upregulated in skin cells, or in other cell types. For models of skin abnormalities, one may use
a skin-specific promoter to drive expression of the transgene, or other inducible promoter that
can be regulated in the animal model. Such promoters include keratin gene promoters. Specific
constructs of interest include anti-sense ptc, which will block ptc expression, expression of
dominant negative ptc mutations, and over-expression of HH genes. A detectable marker, such
as lacZ, may be introduced into the patched locus, where upregulation of patched expression will
result in an easily detected change in phenotype.

One may also provide for expression of the patched gene or variants thereof in cells or
tissues where it is not normally expressed or at abnormal times of development. Thus, mouse
models of spina bifida or abnormal motor neuron differentiation in the developing spinal cord
are made available. In addition, by providing expression of ptc protein in cells in which it is
otherwise not normally produced, one can induce changes in cell behavior, e.g. through ptc
mediated transcription modulation.

DNA constructs for homologous recombination will comprise at least a portion of the
patched or hedgehog gene with the desired genetic modification, and will include regions of
homology to the target locus. DNA constructs for random integration need not include regions
of homology to mediate recombination. Conveniently, markers for positive and negative
selection are included. Methods for generating cells having targeted gene modifications through
homologous recombination are known in the art. For various techniques for transfecting
mammalian cells, see Keown et al. (1990) Methods in Enzymology 185:527-537.

For embryonic stem (ES) cells, an ES cell line may be employed, or ES cells may be obtained
freshly from a host, e.g. mouse, rat, guinea pig, etc. Such cells are grown on an appropriate
fibroblast-feeder layer or grown in the presence of leukemia inhibiting factor (LIF). When ES
cells have been transformed, they may be used to produce transgenic animals. After
transformation, the cells are plated onto a feeder layer in an appropriate medium. Cells
containing the construct may be detected by employing a selective medium. After sufficient time
for colonies to grow, they are picked and analyzed for the occurrence of homologous
recombination or integration of the construct. Those colonies that are positive may then be used
for embryo manipulation and blastocyst injection. Blastocysts are obtained from 4 to 6 week old
superovulated females. The ES cells are trypsinized, and the modified cells are injected into the
blastocoel of the blastocyst. After injection, the blastocysts are returned to each uterine horn of
pseudopregnant females. Females are then allowed to go to term and the resulting litters
screened for mutant cells having the construct. By providing for a different phenotype of the
blastocyst and the ES cells, chimeric progeny can be readily detected.

The chimeric animals are screened for the presence of the modified gene and males and
females having the modification are mated to produce homozygous progeny. If the gene
alterations cause lethality at some point in development, tissues or organs can be maintained as
allogeneic or congenic grafts or transplants, or in in vitro culture. The transgenic animals may
be any non-human mammal, such as laboratory animals, domestic animals, etc. The transgenic
animals may be used in functional studies, drug screening, etc., e.g. to determine the effect of
a candidate drug on basal cell carcinomas.

The subject gene may be employed for producing all or portions of the patched protein.
For expression, an expression cassette may be employed, providing for a transcriptional and
translational initiation region, which may be inducible or constitutive, the coding region under
the transcriptional control of the transcriptional initiation region, and a transcriptional and
translational termination region. Various transcriptional initiation regions may be employed
which are functional in the expression host.

Specific ptc peptides of interest include the extracellular domains, particularly in the
human mature protein, aa 120 to 437, and aa 770 to 1027. These peptides may be used as
immunogens to raise antibodies that recognize the protein in an intact cell membrane. The
cytoplasmic domains, as shown in Figure 2 (the amino terminus and carboxy terminus), are of
interest in binding assays to detect ligands involved in signaling mediated by ptc.

The peptide may be expressed in prokaryotes or eukaryotes in accordance with
conventional ways, depending upon the purpose for expression. For large scale production of
the protein, a unicellular organism, such as E. coli, B. subtilis, S. cerevisiae, and the like, or
cells of a higher organism, e.g. eukaryotes such as vertebrates, particularly mammals, may be
used as the expression host. In many situations, it may be desirable to express the patched
gene in a mammalian host, whereby the patched gene product will be glycosylated, and transported to
the cellular membrane for various studies.

With the availability of the protein in large amounts by employing an expression host,
the protein may be isolated and purified in accordance with conventional ways. A lysate may be
prepared of the expression host and the lysate purified using HPLC, exclusion chromatography,
gel electrophoresis, affinity chromatography, or other purification technique. The purified
protein will generally be at least about 80% pure, preferably at least about 90% pure, and may
be up to and including 100% pure. By pure is intended free of other proteins, as well as cellular
debris.

The polypeptide is used for the production of antibodies, where short fragments provide
for antibodies specific for the particular polypeptide, whereas larger fragments or the entire gene
allow for the production of antibodies over the surface of the polypeptide or protein. Antibodies
may be raised to the normal or mutated forms of ptc. The extracellular domains of the protein
are of interest as epitopes, particularly for antibodies that recognize common changes found in
abnormal, oncogenic ptc, which compromise the protein activity. Antibodies may be raised to
isolated peptides corresponding to these domains, or to the native protein, e.g. by immunization
with cells expressing ptc, immunization with liposomes having ptc inserted in the membrane, etc.
Antibodies that recognize the extracellular domains of ptc are useful in diagnosis, typing and
staging of human carcinomas.

Antibodies are prepared in accordance with conventional ways, where the expressed
polypeptide or protein may be used as an immunogen, by itself or conjugated to known
immunogenic carriers, e.g. KLH, pre-S HBsAg, other viral or eukaryotic proteins, or the like.
Various adjuvants may be employed, with a series of injections, as appropriate. For monoclonal
antibodies, after one or more booster injections, the spleen may be isolated, the splenocytes
immortalized, and then screened for high affinity antibody binding. The immortalized cells, e.g.
hybridomas, producing the desired antibodies may then be expanded. For further description,
see Monoclonal Antibodies: A Laboratory Manual, Harlow and Lane eds., Cold Spring Harbor
Laboratories, Cold Spring Harbor, New York, 1988. If desired, the mRNA encoding the heavy
and light chains may be isolated and mutagenized by cloning in E. coli, and the heavy and light
chains may be mixed to further enhance the affinity of the antibody.

The antibodies find particular use in diagnostic assays for developmental abnormalities,
basal cell carcinomas and other tumors associated with mutations in ptc. Staging, detection and
typing of tumors may utilize a quantitative immunoassay for the presence or absence of normal
ptc. Alternatively, the presence of mutated forms of ptc may be determined. A reduction in
normal ptc and/or presence of abnormal ptc is indicative that the tumor is ptc-associated.

A sample is taken from a patient suspected of having a ptc-associated tumor,
developmental abnormality or BCNS. Samples, as used herein, include biological fluids such as
blood, cerebrospinal fluid, tears, saliva, lymph, dialysis fluid and the like; organ or tissue culture
derived fluids; and fluids extracted from physiological tissues. Also included in the term are
derivatives and fractions of such fluids. Biopsy samples are of particular interest, e.g. skin
lesions, organ tissue fragments, etc. Where metastasis is suspected, blood samples may be
preferred. The number of cells in a sample will generally be at least about 10^3, usually at least
10^4, more usually at least about 10^5. The cells may be dissociated, in the case of solid tissues,
or tissue sections may be analyzed. Alternatively a lysate of the cells may be prepared.

Diagnosis may be performed by a number of methods. The different methods all
determine the absence or presence of normal or abnormal ptc in patient cells suspected of having
a mutation in ptc. For example, detection may utilize staining of intact cells or histological
sections, performed in accordance with conventional methods. The antibodies of interest are
added to the cell sample, and incubated for a period of time sufficient to allow binding to the
epitope, usually at least about 10 minutes. The antibody may be labeled with radioisotopes,
enzymes, fluorescers, chemiluminescers, or other labels for direct detection. Alternatively, a
second stage antibody or reagent is used to amplify the signal. Such reagents are well-known
in the art. For example, the primary antibody may be conjugated to biotin, with horseradish
peroxidase-conjugated avidin added as a second stage reagent. Final detection uses a substrate
that undergoes a color change in the presence of the peroxidase. The absence or presence of
antibody binding may be determined by various methods, including flow cytometry of
dissociated cells, microscopy, radiography, scintillation counting, etc.

An alternative method for diagnosis depends on the in vitro detection of binding between
antibodies and ptc in a lysate. Measuring the concentration of ptc binding in a sample or fraction
thereof may be accomplished by a variety of specific assays. A conventional sandwich type assay
may be used. For example, a sandwich assay may first attach ptc-specific antibodies to an
insoluble surface or support. The particular manner of binding is not crucial so long as it is
compatible with the reagents and overall methods of the invention. They may be bound to the
plates covalently or non-covalently, preferably non-covalently.

The insoluble supports may be any compositions to which polypeptides can be bound,
which is readily separated from soluble material, and which is otherwise compatible with the
overall method. The surface of such supports may be solid or porous and of any convenient
shape. Examples of suitable insoluble supports to which the receptor is bound include beads, e.g.
magnetic beads, membranes and microtiter plates. These are typically made of glass, plastic (e.g.
polystyrene), polysaccharides, nylon or nitrocellulose. Microtiter plates are especially convenient
because a large number of assays can be carried out simultaneously, using small amounts of
reagents and samples.

Patient sample lysates are then added to separately assayable supports (for example,
separate wells of a microtiter plate) containing antibodies. Preferably, a series of standards,
containing known concentrations of normal and/or abnormal ptc, is assayed in parallel with the
samples or aliquots thereof to serve as controls. Preferably, each sample and standard will be
added to multiple wells so that mean values can be obtained for each. The incubation time
should be sufficient for binding; generally, from about 0.1 to 3 hr is sufficient. After incubation,
the insoluble support is generally washed of non-bound components. Generally, a dilute
non-ionic detergent medium at an appropriate pH, generally 7-8, is used as a wash medium. From
one to six washes may be employed, with sufficient volume to thoroughly wash nonspecifically
bound proteins present in the sample.

After washing, a solution containing a second antibody is applied. The antibody will bind
ptc with sufficient specificity such that it can be distinguished from other components present.
The second antibodies may be labeled to facilitate direct or indirect quantification of binding.
Examples of labels that permit direct measurement of second receptor binding include
radiolabels, such as ³H or ¹²⁵I, fluorescers, dyes, beads, chemiluminescers, colloidal particles,
and the like. Examples of labels which permit indirect measurement of binding include enzymes
where the substrate may provide for a colored or fluorescent product. In a preferred
embodiment, the antibodies are labeled with a covalently bound enzyme capable of providing
a detectable product signal after addition of suitable substrate. Examples of suitable enzymes
for use in conjugates include horseradish peroxidase, alkaline phosphatase, malate
dehydrogenase and the like. Where not commercially available, such antibody-enzyme
conjugates are readily produced by techniques known to those skilled in the art. The incubation
time should be sufficient for the labeled ligand to bind available molecules. Generally, from
about 0.1 to 3 hr is sufficient, usually 1 hr sufficing.

After the second binding step, the insoluble support is again washed free of
non-specifically bound material. The signal produced by the bound conjugate is detected by
conventional means. Where an enzyme conjugate is used, an appropriate enzyme substrate is
provided so a detectable product is formed.

Other immunoassays are known in the art and may find use as diagnostics. Ouchterlony
plates provide a simple determination of antibody binding. Western blots may be performed on
protein gels or protein spots on filters, using a detection system specific for ptc as desired,
conveniently using a labeling method as described for the sandwich assay.

Other diagnostic assays of interest are based on the functional properties of ptc protein
itself. Such assays are particularly useful where a large number of different sequence changes
lead to a common phenotype, i.e., loss of protein function leading to oncogenesis or
developmental abnormality. For example, a functional assay may be based on the transcriptional
changes mediated by hedgehog and patched gene products. Addition of soluble Hh to
embryonic stem cells causes induction of transcription in target genes. The presence of
functional ptc can be determined by its ability to antagonize Hh activity. Other functional assays
may detect the transport of specific molecules mediated by ptc, in an intact cell or membrane
fragment. Conveniently, a labeled substrate is used, where the transport in or out of the cell can
be quantitated by radiography, microscopy, flow cytometry, spectrophotometry, etc. Other
assays may detect conformational changes, or changes in the subcellular localization of patched
protein.

By providing for the production of large amounts of patched protein, one can identify
ligands or substrates that bind to, modulate or mimic the action of patched. A common feature
in basal cell carcinoma is the loss of adhesion between epidermal and dermal layers, indicating
a role for ptc in maintaining appropriate cell adhesion. Areas of investigation include the
development of cancer treatments, wound healing, adverse effects of aging, metastasis, etc.

Drug screening identifies agents that provide a replacement for ptc function in abnormal
cells. The role of ptc as a tumor suppressor indicates that agents which mimic its function, in
terms of transmembrane transport of molecules, transcriptional down-regulation, etc., will inhibit
the process of oncogenesis. These agents may also promote appropriate cell adhesion in wound
healing and aging, to reverse the loss of adhesion observed in metastasis, etc. Conversely, agents
that reverse ptc function may stimulate controlled growth and healing. Of particular interest are
screening assays for agents that have a low toxicity for human cells. A wide variety of assays
may be used for this purpose, including labeled in vitro protein-protein binding assays,
electrophoretic mobility shift assays, immunoassays for protein binding, and the like. The
purified protein may also be used for determination of three-dimensional crystal structure, which
can be used for modeling intermolecular interactions, transporter function, etc.

The term "agent" as used herein describes any molecule, e.g. protein or pharmaceutical,
with the capability of altering or mimicking the physiological function of patched. Generally a
plurality of assay mixtures are run in parallel with different agent concentrations to obtain a
differential response to the various concentrations. Typically, one of these concentrations serves
as a negative control, i.e. at zero concentration or below the level of detection.
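
As a minimal sketch of how parallel assay mixtures might be compared, each reading can be expressed relative to the zero-concentration negative control. The concentrations, readings and function name below are hypothetical and chosen only to illustrate the arithmetic.

```python
def relative_response(readings: dict[float, float]) -> dict[float, float]:
    """Express each reading as a fraction of the zero-concentration control.

    `readings` maps agent concentration to the measured assay signal.
    """
    if 0.0 not in readings:
        raise ValueError("a zero-concentration negative control is required")
    control = readings[0.0]
    if control == 0:
        raise ValueError("control reading must be non-zero")
    return {conc: value / control for conc, value in sorted(readings.items())}


# Hypothetical signals from mixtures run in parallel (arbitrary units).
readings = {0.0: 200.0, 1.0: 180.0, 10.0: 100.0, 100.0: 50.0}
response = relative_response(readings)
for conc, fraction in response.items():
    print(conc, round(fraction, 2))
```

Values below 1.0 indicate a reduction relative to the control, so the differential response across concentrations can be read directly from the output.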

Candidate agents encompass numerous chemical classes, though typically they are
organic molecules, preferably small organic compounds having a molecular weight of more than
50 and less than about 2,500 daltons. Candidate agents comprise functional groups necessary
for structural interaction with proteins, particularly hydrogen bonding, and typically include at
least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional
chemical groups. The candidate agents often comprise cyclical carbon or heterocyclic structures
and/or aromatic or polyaromatic structures substituted with one or more of the above functional
groups. Candidate agents are also found among biomolecules including peptides, saccharides,
fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof.

Candidate agents are obtained from a wide variety of sources including libraries of
synthetic or natural compounds. For example, numerous means are available for random and
directed synthesis of a wide variety of organic compounds and biomolecules, including
expression of randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural
compounds in the form of bacterial, fungal, plant and animal extracts are available or readily
produced. Additionally, natural or synthetically produced libraries and compounds are readily
modified through conventional chemical, physical and biochemical means, and may be used to
produce combinatorial libraries. Known pharmacological agents may be subjected to directed
or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc.
to produce structural analogs.

Where the screening assay is a binding assay, one or more of the molecules may be
joined to a label, where the label can directly or indirectly provide a detectable signal. Various
labels include radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules,
particles, e.g. magnetic particles, and the like. Specific binding molecules include pairs, such as
biotin and streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the
complementary member would normally be labeled with a molecule that provides for detection,
in accordance with known procedures.

A variety of other reagents may be included in the screening assay. These include
reagents like salts, neutral proteins, e.g. albumin, detergents, etc. that are used to facilitate
optimal protein-protein binding and/or reduce nonspecific or background interactions. Reagents
that improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors,
anti-microbial agents, etc. may be used. The mixture of components is added in any order that
provides for the requisite binding. Incubations are performed at any suitable temperature,
typically between 4° and 40° C. Incubation periods are selected for optimum activity, but may
also be optimized to facilitate rapid high-throughput screening. Typically between 0.1 and 1
hours will be sufficient.

Other assays of interest detect agents that mimic patched function, such as repression
of target gene transcription, transport of patched substrate compounds, etc. For example, an
expression construct comprising a patched gene may be introduced into a cell line under
conditions that allow expression. The level of patched activity is determined by a functional
assay, as previously described. In one screening assay, candidate agents are added in
combination with a Hh protein, and the ability to overcome Hh antagonism of ptc is detected.
In another assay, the ability of candidate agents to enhance ptc function is determined.
Alternatively, candidate agents are added to a cell that lacks functional ptc, and screened for the
ability to reproduce ptc in a functional assay.

The compounds having the desired pharmacological activity may be administered in a
physiologically acceptable carrier to a host for treatment of cancer or developmental
abnormalities attributable to a defect in patched function. The compounds may also be used to
enhance patched function in wound healing, aging, etc. The inhibitory agents may be
administered in a variety of ways, orally, topically, parenterally e.g. subcutaneously,
intraperitoneally, by viral infection, intravascularly, etc. Topical treatments are of particular
interest. Depending upon the manner of introduction, the compounds may be formulated in a
variety of ways. The concentration of therapeutically active compound in the formulation may
vary from about 0.1-100 wt.%.

The pharmaceutical compositions can be prepared in various forms, such as granules,
tablets, pills, suppositories, capsules, suspensions, salves, lotions and the like. Pharmaceutical
grade organic or inorganic carriers and/or diluents suitable for oral and topical use can be used
to make up compositions containing the therapeutically-active compounds. Diluents known to
the art include aqueous media, vegetable and animal oils and fats. Stabilizing agents, wetting and
emulsifying agents, salts for varying the osmotic pressure or buffers for securing an adequate
pH value, and skin penetration enhancers can be used as auxiliary agents.

The gene or fragments thereof may be used as probes for identifying the 5' non-coding
region comprising the transcriptional initiation region, particularly the enhancer regulating the
transcription of patched. By probing a genomic library, particularly with a probe comprising the
5' coding region, one can obtain fragments comprising the 5' non-coding region. If necessary,
one may walk the fragment to obtain further 5' sequence to ensure that one has at least a
functional portion of the enhancer. It is found that the enhancer is proximal to the 5' coding
region, a portion being in the transcribed sequence and downstream from the promoter
sequences. The transcriptional initiation region may be used for many purposes, studying
embryonic development, providing for regulated expression of patched protein or other protein
of interest during embryonic development or thereafter, and in gene therapy.

The gene may also be used for gene therapy. Vectors useful for introduction of the gene
include plasmids and viral vectors. Of particular interest are retroviral-based vectors, e.g.
Moloney murine leukemia virus and modified human immunodeficiency virus; adenovirus
vectors, etc. Gene therapy may be used to treat skin lesions, an affected fetus, etc., by
transfection of the normal gene into embryonic stem cells or into other fetal cells. A wide variety
of viral vectors can be employed for transfection and stable integration of the gene into the
genome of the cells. Alternatively, micro-injection, fusion, or the like may be employed for
introduction of genes into a suitable host cell. See, for example, Dhawan et al. (1991) Science
254:1509-1512 and Smith et al. (1990) Molecular and Cellular Biology 3268-3271.

The following examples are offered by way of illustration and not by way of limitation.

EXPERIMENTAL 

Methods and Materials

PCR on Mosquito (Anopheles gambiae) Genomic DNA. PCR primers were based on
amino acid stretches of fly ptc that were not likely to diverge over evolutionary time and were
of low degeneracy. Two such primers (P2R1 (SEQ ID NO:14):
GGACGAATTCAARrrTMrAVCARYTNTnTT; P4R1 (SEQ ID NO:15):
GGACGAATTCrVTrrr AP a ^Bf A>rrr) (the underlined sequences are Eco RI linkers)
amplified an appropriately sized band from mosquito genomic DNA using the PCR. The
program conditions were as follows:

94°C 4 min.; 72°C Add Taq;
[49°C 30 sec.; 72°C 90 sec.; 94°C 15 sec.] 3 times
[94°C 15 sec.; 50°C 30 sec.; 72°C 90 sec.] 35 times
72°C 10 min.; 4°C hold
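
For planning purposes, the cycling program listed above can be written as a small data structure and its programmed hold time summed. This is a sketch under stated assumptions: the layout and names are illustrative, ramp times between temperatures are ignored, and the untimed "Add Taq" pause at 72°C and the final 4°C hold are excluded because the text gives no duration for them.

```python
# Each block: how many times it repeats, and its (temperature in deg C,
# duration in seconds) steps, as read from the program listed above.
PROGRAM = [
    {"repeats": 1, "steps": [(94, 240)]},                      # initial denaturation
    {"repeats": 3, "steps": [(49, 30), (72, 90), (94, 15)]},   # low-stringency cycles
    {"repeats": 35, "steps": [(94, 15), (50, 30), (72, 90)]},  # main cycles
    {"repeats": 1, "steps": [(72, 600)]},                      # final extension
]


def programmed_seconds(program: list[dict]) -> int:
    """Sum the timed steps of every block, multiplied by its repeat count."""
    return sum(
        block["repeats"] * sum(duration for _, duration in block["steps"])
        for block in program
    )


total = programmed_seconds(PROGRAM)
print(f"{total} s ({total / 60:.1f} min) of programmed hold time")
```

The same structure describes the mouse cDNA amplification reported later in this section by dropping the three low-stringency cycles.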

This band was subcloned into the EcoRV site of pBluescript II and sequenced using the USB
Sequenase kit.

Screen of a Butterfly cDNA Library with Mosquito PCR Product. Using the mosquito
PCR product (SEQ ID NO:7) as a probe, a 3 day embryonic Precis coenia λgt10 cDNA library
(generously provided by Sean Carroll) was screened. Filters were hybridized at 65° C overnight
in a solution containing 5x SSC, 10% dextran sulfate, 5x Denhardt's, 200 µg/ml sonicated
salmon sperm DNA, and 0.5% SDS. Filters were washed in 0.1X SSC, 0.1% SDS at room
temperature several times to remove nonspecific hybridization. Of the 100,000 plaques initially
screened, 2 overlapping clones, L1 and L2, were isolated, which corresponded to the N terminus
of butterfly ptc. Using L2 as a probe, the library filters were rescreened and 3 additional clones
(L5, L7, L8) were isolated which encompassed the remainder of the ptc coding sequence. The
full length sequence of butterfly ptc (SEQ ID NO:3) was determined by ABI automated
sequencing.

Screen of a Tribolium (beetle) Genomic Library with Mosquito PCR Product and 900
bp Fragment from the Butterfly Clone. A λgem11 genomic library from Tribolium castaneum
(gift of Rob Denell) was probed with a mixture of the mosquito PCR product (SEQ ID NO:7)
and a BstXI/EcoRI fragment of L2. Filters were hybridized at 55° C overnight and washed as
above. Of the 75,000 plaques screened, 14 clones were identified and the SacI fragment of T8
(SEQ ID NO:1), which cross-hybridized with the mosquito and butterfly probes, was subcloned
into pBluescript.

PCR on Mouse cDNA Using Degenerate Primers Derived from Regions Conserved in
the Four Insect Homologues. Two degenerate PCR primers (P4REV (SEQ ID NO:16):
GCrACgAATTCYTNGANTGYTTYTGGGA; P22 (SEQ ID NO:17): CATACCAnrCAAf;
CUfiTCIGGCCARTGCAT) were designed based on a comparison of ptc amino acid
sequences from fly (Drosophila melanogaster) (SEQ ID NO:6), mosquito (Anopheles gambiae)
(SEQ ID NO:8), butterfly (Precis coenia) (SEQ ID NO:4), and beetle (Tribolium castaneum)
(SEQ ID NO:2). I represents inosine, which can form base pairs with all four nucleotides. P22
was used to reverse transcribe RNA from 12.5 dpc mouse limb bud (gift from David Kingsley)
for 90 min at 37° C. PCR using P4REV (SEQ ID NO:17) and P22 (SEQ ID NO:18) was then
performed on 1 µl of the resultant cDNA under the following conditions:

94°C 4 min.; 72°C Add Taq;
[94°C 15 sec.; 50°C 30 sec.; 72°C 90 sec.] 35 times
72°C 10 min.; 4°C hold

PCR products of the expected size were subcloned into the TA vector (Invitrogen)
and sequenced with the Sequenase Version 2.0 DNA Sequencing Kit (U.S.B.).

Using the cloned mouse PCR fragment as a probe, 300,000 plaques of a mouse 8.5 dpc
λgt10 cDNA library (a gift from Brigid Hogan) were screened at 65° C as above and washed
in 2x SSC, 0.1% SDS at room temperature. 7 clones were isolated, and three (M2, M4, and
M8) were subcloned into pBluescript II. 200,000 plaques of this library were rescreened using
first a 1.1 kb EcoRI fragment from M2 to identify 6 clones (M9-M16) and secondly a mixed
probe containing the most N terminal (XhoI fragment from M2) and most C terminal sequences
(BamHI/BglII fragment from M9) to isolate 5 clones (M17-M21). M9, M10, M14, and M17-21
were subcloned into the EcoRI site of pBluescript II (Stratagene).

RNA Blots and in situ Hybridizations in Whole and Sectioned Mouse Embryos:
Northerns. A mouse embryonic Northern blot and an adult multiple tissue Northern blot
(obtained from Clontech) were probed with a 900 bp EcoRI fragment from an N terminal coding
region of mouse ptc. Hybridization was performed at 65° C in 5x SSPE, 10x Denhardt's, 100
µg/ml sonicated salmon sperm DNA, and 2% SDS. After several short room temperature
washes in 2x SSC, 0.05% SDS, the blots were washed at high stringency in 0.1X SSC, 0.1%
SDS at 50° C.

In situ hybridization of sections: 7.75, 8.5, 11.5, and 13.5 dpc mouse embryos were
dissected in PBS and frozen in Tissue-Tek medium at -80° C. 12-16 µm frozen sections were
cut, collected onto VectaBond (Vector Laboratories) coated slides, and dried for 30-60 minutes
at room temperature. After a 10 minute fixation in 4% paraformaldehyde in PBS, the slides
were washed 3 times for 3 minutes in PBS, acetylated for 10 minutes in 0.25% acetic anhydride
in triethanolamine, and washed three more times for 5 minutes in PBS. Prehybridization (50%
formamide, 5X SSC, 250 µg/ml yeast tRNA, 500 ng/ml sonicated salmon sperm DNA, and 5x
Denhardt's) was carried out for 6 hours at room temperature in 50% formamide/5x SSC
humidified chambers. The probe, which consisted of 1 kb from the N-terminus of ptc, was
added at a concentration of 200-1000 ng/ml into the same solution used for prehybridization,
and then denatured for five minutes at 80° C. Approximately 75 µl of probe were added to
each slide and covered with Parafilm. The slides were incubated overnight at 65° C in the same
humidified chamber used previously. The following day, the probe was washed successively in
5X SSC (5 minutes, 65° C), 0.2X SSC (1 hour, 65° C), and 0.2X SSC (10 minutes, room
temperature). After five minutes in buffer B1 (0.1 M maleic acid, 0.15 M NaCl, pH 7.5), the
slides were blocked for 1 hour at room temperature in 1% blocking reagent
(Boehringer-Mannheim) in buffer B1, and then incubated for 4 hours in buffer B1 containing the DIG-AP
conjugated antibody (Boehringer-Mannheim) at a 1:5000 dilution. Excess antibody was
removed during two 15 minute washes in buffer B1, followed by five minutes in buffer B3 (100
mM Tris, 100 mM NaCl, 5 mM MgCl2, pH 9.5). The antibody was detected by adding an alkaline
phosphatase substrate (350 µl 75 mg/ml X-phosphate in DMF, 450 µl 50 mg/ml NBT in 70%
DMF in 100 mls of buffer B3) and allowing the reaction to proceed overnight in the dark. After
a brief rinse in 10 mM Tris, 1 mM EDTA, pH 8.0, the slides were mounted with Aquamount
(Lerner Laboratories).

Drosophila 5'-transcriptional initiation region β-gal constructs. A series of constructs
were designed that link different regions of the ptc promoter from Drosophila to a LacZ
reporter gene in order to study the cis regulation of the ptc expression pattern. See Fig. 1. A
10.8 kb BamHI/BspM1 segment comprising the 5'-non-coding region of the mRNA at its
3'-terminus was obtained and truncated by restriction enzyme digestion as shown in Fig. 1. These
expression cassettes were introduced into Drosophila lines using a P-element vector (Thummel
et al. (1988) Gene 74:445-456), which were injected into embryos, providing flies which could
be grown to produce embryos. (See Spradling and Rubin (1982) Science 218:341-347 for a
description of the procedure.) The vector used a pUC8 background into which was introduced
the white gene to provide for yellow eyes, portions of the P-element for integration, and the
constructs were inserted into a polylinker upstream from the LacZ gene. The resulting embryos,
larvae, and adults were stained using antibodies to LacZ protein conjugated to HRP and the
samples developed with OPD dye to identify the expression of the LacZ gene. The staining
pattern in embryos is described in Fig. 1, indicating whether there was staining during the early
and late development of the embryo.

Isolation of a Mouse ptc Gene. Homologues of fly ptc (SEQ ID NO:6) were isolated
from three insects: mosquito, butterfly and beetle, using either PCR or low stringency library
screens. PCR primers to six amino acid stretches of ptc of low mutability and degeneracy
were designed. One primer pair, P2 and P4, amplified a homologous fragment of ptc from
mosquito genomic DNA that corresponded to the first hydrophilic loop of the protein. The
345 bp PCR product (SEQ ID NO:7) was subcloned and sequenced and, when aligned to fly ptc,
showed 67% amino acid identity.
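
The degeneracy of a primer designed against a short amino acid stretch can be estimated by multiplying the number of codons that encode each residue. The following is a minimal sketch of that arithmetic, assuming the standard genetic code; the function name and the two example peptides are hypothetical and are not taken from the ptc sequence or from primers P2 and P4.

```python
# Number of codons encoding each amino acid in the standard genetic code.
CODONS_PER_AA = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def primer_degeneracy(peptide: str) -> int:
    """Return how many distinct codon combinations can encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODONS_PER_AA[residue]
    return total


# Hypothetical six-residue stretches, chosen only to show the contrast.
low = primer_degeneracy("WMDCFW")   # 1 * 1 * 2 * 2 * 2 * 1
high = primer_degeneracy("LSRLSA")  # 6 * 6 * 6 * 6 * 6 * 4
print(low, high)
```

Stretches rich in Met, Trp and two-codon residues keep the primer pool small, which is one reading of "low degeneracy" in the passage above.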

The cloned mosquito fragment was used to screen a butterfly λgt10 cDNA library. Of
100,000 plaques screened, five overlapping clones were isolated and used to obtain the full
length coding sequence. The butterfly ptc homologue (SEQ ID NO:4) is 1,311 amino acids long
and overall has 50% amino acid identity (72% similarity) to fly ptc. With the exception of a
divergent C-terminus, this homology is evenly spread across the coding sequence. The
mosquito PCR clone (SEQ ID NO:7) and a corresponding fragment of butterfly cDNA were
used to screen a beetle λgem11 genomic library. Of the plaques screened, 14 clones were
identified. A fragment of one clone (T8), which hybridized with the original probes, was
subcloned and sequenced. This 3 kb piece contains an 89 amino acid exon (SEQ ID NO:2)
which is 44% and 51% identical to the corresponding regions of fly and butterfly ptc,
respectively.
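
Percent identity figures such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention, assuming the two sequences are already aligned to equal length and that columns containing a gap are excluded from the denominator; the text does not state which convention was used, and the example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 9 gap-free columns, 8 of them identical.
print(round(percent_identity("ACDEFGHIKL", "ACDQFGH-KL"), 1))
```

A similarity figure (such as the 72% quoted for butterfly versus fly) would additionally count conservative substitutions, which requires a substitution matrix and is not shown here.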

Using an alignment of the four insect homologues in the first hydrophilic loop of ptc,
two PCR primers were designed to a five and a six amino acid stretch which were identical and
of low degeneracy. These primers were used to isolate the mouse homologue using RT-PCR
on embryonic limb bud RNA. An appropriately sized band was amplified and, upon cloning and
sequencing, it was found to encode a protein 65% identical to fly ptc. Using the cloned PCR
product and, subsequently, fragments of mouse ptc cDNA, a mouse embryonic λ cDNA library
was screened. From about 300,000 plaques, 17 clones were identified and of these, 7 form
overlapping cDNAs that comprise most of the protein-coding sequence (SEQ ID NO:9).

Developmental and Tissue Distribution of Mouse ptc RNA. In both the embryonic and
adult Northern blots, the ptc probe detects a single 8 kb message. Further exposure does not
reveal any additional minor bands. Developmentally, ptc mRNA is present in low levels as early
as 7 dpc and becomes quite abundant by 11 and 15 dpc. While the transcript is still present at 17
dpc, the Northern blot indicates a clear decrease in the amount of message at this stage. In the
adult, ptc RNA is present in high amounts in the brain and lung, as well as in moderate amounts
in the kidney and liver. Weak signals are detected in heart, spleen, skeletal muscle, and testes.

In situ Hybridization of Mouse ptc in Whole and Section Embryos. Northern analysis
indicates that ptc mRNA is present at 7 dpc, while there is no detectable signal in sections from
7.75 dpc embryos. This discrepancy is explained by the low level of transcription. In contrast,
ptc is present at high levels along the neural axis of 8.5 dpc embryos. By 11.5 dpc, ptc can be
detected in the developing lung buds and gut, consistent with its adult Northern profile. In
addition, the gene is present at high levels in the ventricular zone of the central nervous system,
as well as in the zona limitans of the prosencephalon. ptc is also strongly transcribed in the
condensing cartilage of 11.5 and 13.5 dpc limb buds, as well as in the ventral portion of the
somites, a region which is prospective sclerotome and eventually forms bone in the vertebral
column. ptc is present in a wide range of tissues of endodermal, mesodermal and ectodermal
origin, supporting its fundamental role in embryonic development.

Isolation of the Human ptc Gene. To isolate human ptc (hptc), 2x10^ plaques from a
human lung cDNA library (HL3022a, Clontech) were screened with a 1 kbp mouse ptc
fragment, M2-2. Filters were hybridized overnight at reduced stringency (60°C in 5X SSC,
10% dextran sulfate, 5X Denhardt's, 0.2 mg/ml sonicated salmon sperm DNA, and 0.5% SDS).
Two positive plaques (H1 and H2) were isolated, the inserts cloned into pBluescript, and upon
sequencing, both contained sequence highly similar to the mouse ptc homolog. To isolate the
5' end, an additional 6 x 10^ plaques were screened in duplicate with M2-3 EcoRI and M2-3
Xho I (containing 5' untranslated sequence of mouse ptc) probes. Ten plaques were purified
and of these, inserts were subcloned into pBluescript. To obtain the full coding sequence, H2
was fully and H14, H20, and H21 were partially sequenced. The 5.1 kbp of human ptc sequence
(SEQ ID NO:18) contains an open reading frame of 1447 amino acids (SEQ ID NO:19) that
is 96% identical and 98% similar to mouse ptc. The 5' and 3' untranslated sequences of human
ptc (SEQ ID NO:18) are also highly similar to mouse ptc (SEQ ID NO:19), suggesting
conserved regulatory sequence.

Comparison of Mouse, Human, Fly and Butterfly Sequences. The deduced mouse
ptc protein sequence (SEQ ID NO:10) has about 38% identical amino acids to fly ptc over about
1,200 amino acids. This amount of conservation is dispersed through much of the protein
excepting the C-terminal region. The mouse protein also has a 50 amino acid insert relative to
the fly protein. Based on the sequence conservation of ptc and the functional conservation of
hedgehog between fly and mouse, one concludes that ptc functions similarly in the two
organisms. A comparison of the amino acid sequences of mouse (mptc) (SEQ ID NO:10),
human (hptc) (SEQ ID NO:19), butterfly (bptc) (SEQ ID NO:4) and Drosophila (ptc) (SEQ ID
NO:6) is shown in Table 1.

TABLE 1 

ALIGNMENT OF HUMAN, MOUSE, FLY, AND BUTTERFLY PTC HOMOLOGS



HPTC MASAGNAAEPQDR — CGGGSGCICAPCRPAGGGRimRTGGLRRAAAPDRDyLHRPSyCDA 

MPTC MASACNAA GALGRQAGGGRRRRTGGPHRA-APDRDyLHRPSYCDA 

15 PTC M DRDSLPRVPDTHGD — WDE KLFSOL YI-RTSWVDA 

BPTC MVAPDSEAPSNPRITAAHESPCATEA RHSADL YI-RTSWVDA 

* ... 

HPTC AF ALEQI SKGKATGRKAPLWLRAKFQRLLFKLGCY IQKNCG KFL WGLLI FGAFAVCLKA 
20 MPTC AFALEQISKCKATGRKAPLWUUUOPQRIXFKMCYIQKNCGKFLWGIXIFGAFAVGIJ^ 
PTC QVALDQIDKGKARGSRTAIYLRSVFQSHLETLGSSVQKHAGKVLFVAILVLSTFCVGLKS 
BPTC ALALSELEKGNIECGRTSLWIRAWLQEQLFILGCFLQGDAGKVLFVAILVLSTFCVGLKS 
** **. * *. * ** * , ** * ***♦. 

25 HPTC ANLETNVEELWVEVCGRVSRELNYTRQKIGEEAMFNPQLMIQTPKEECANVLTTEALLQH 
MPTC ANLETNVEELWVEVGGRVSRELNYTRQKIGEEAMFNPQLMIQTPKEEGANVLTTEALLQH 
PTC AQIHSKVHQLWIQEGGRLEAEIAYTQKTIGEDESATHQIXIQTTHDPNASVLHPQALLAH 
BPTC AQIHTRVDQLWVQEGGRLEAELKYTAQALGEADS STHQLVIQTAKDPDVSLLHPGALLEH 
*..**.. ***. *» ** . .** ***** . . * *** * 



30 



35 



HPTC LDSALQASRVHVYMYNRQWKLEHLCYKSGELITET-CYMDQIIEyLYPCLIITPLDCFWB 

MPTC LDSALQASRVHVYHYNRQWKLEHLCYKSGELITET-GYMDQI lEYLYPCLIITPLDCFWE 

PTC IJBVLVKATAVKVHLYDTEWGLRDMCNMPSTPSFEGIYYIEQILRHLIPCSIITPLDCFWE 

BPTC LKWHAATRVTVHMYDIEWRLKDLCYSPSIPDFEGYHHIESIIDNVIPCAIITPLDCFWE 



HPTC GAKLQSGTAYLLGKPPLR WTNFDPLEPLEELK KINYQVDSWEEMLNKABV 

MPTC GAKLQSGTAYLLGKPPLR WTNFDPLEFLEELK KINYQVDSWEEMLNKABV

PTC GSQI*L-GPESAWIPGLNQRLLWTTLNPASVMQYMKQKMSEEKISPDFETVEQYMKRAAI 

40 BPTC CSKLL-GPDYPIYVPHLKHKLQWTHLNPLEWEEVK-KL KFQFPLSTIEAYMKRAGI 

*..* * * * * *. . . ♦ 



HPTC GHGYMDRPCLNPADPDCPATAPNKNSTKPLOMALVLNGCCHGLSRKYMHWQEELIVGCTV 

MPTC GHGYMDRPCLNPADPDCPATAPNKNSTKPLDVALVLNGGCQGLSRKYMHWQEELIVGGTV

45 PTC GSGYMEKPCLKPLNPNCPOTAPNKNSTQPPDVGAILSGGCYGYAAKHMHWPEELXVCGRK 

BPTC TSAYHKKPCLDPTDPHCPATAPNKKSGHIPDVAAELSHGCYGFAAAYMHWPEQLIVGGAT 
,***^* ,*^** **«**^* *^ ** * , •** *.«*•** 

HPTC KNSTGKLVSAHALQTMFQLMTPKQMYEHFKGYEYVSHINWNEDKAAAILEAWQRTYVEW 

MPTC KNATGKLVSAHALQTMFQLMTPKQMYEHFRGYDYVSHIKWNEDRAAAILEAWQRTYVEW 
PTC IWRSCHIJUUVQALQSWQIJITEKEMYDQWQDNYKVra

BFTC RNSTSAUISAIUU^TWQMGEREMYEVWADHYKVHQICWNQEKAAAVMAWQR^ 

* *** . * * . * 

HPTC HQSVAQNSTQR VLSFTTTTLDDILKSPSDVSVIRVASCYLLMLAVACLTMLIIW-DC 

10 MPTC HQSVAPNSTQR VLPFTTTTIJ)DIiatSFSDVSVIRVASGyLIJflAyACLTMLRW-DC 

PTC EQIXRKQSRIATNYDIYVFSSAALDDIIAKPSHPSALSIVIGVAVTVLYAFCTIXRWraP 

BPTC RKI-TTSCSSVSSAYSryPFSTSTLNDILGKFSEVSLKNI ILGYMFMLIYVAVTLIQWRDP 

••••.*•*** **• * . * . • * 

IS HPTC SKSQOAVCIACVIXVAI^VAACWLCSLIGISPNAATTQVLPPIJa^VOVDDVPIJ^AHAr 

MPTC SKSQGAVOIAGVIXVALSVAACW2LCSLIGISFNAATTQ\^PriJa/5VGVDDVPUJUI^ 

PTC VRGQSSVCVACVLIitCPSTAACMI^AIJ^IVFNAASTQVVPriAI/SW 

BPTC IRS0AGV0IAGVIJJLSITVAAGMFCAIJ:/5IPFNASSTQIVPFIJa.GI^VQDMF^ 

20 

HPTC SETGQNKRIPFEDRTCECLKRTCASVALTSISMVTAFFMAALIPIPALRAFSLQAAV\A^ 

MPTC SETGQNKRIPFEDRTCECLKRTGASVALTSISMVTAFFMAALIPIPAIJIAFSLQAAVVVV 

PTC AESN RREQTKLILKKVCPSILFSACSTAGSFPAAAFIPVPALKVFCLQAAIVMC 

BPTC VEQAGD — WREERTGLVIJCKSCLSVLIJlSLCmmAFUUUVIXPIPAFRVPCLQ 

HPTC FNFAMVU.IFPAILSMDLYRREDRRI.OirCCFTSPCVSRVIQVEPQAYTDTHDHTRySPP 

MPTC FNFAMVLLIPPAILSMDLYRPEDRRLDIFCCFTSPCVSRVIQVEPQAYTEPHSNTRYSPP 

PTC SKIAAAIXVFPAMISU)UUmTAGRADIFCCCF-PVWKEQPKVAPPVLPLmJNNGR — 

30 BPTC FNLGSILLVFPAMISLDLRRRSAAPADLLCCLM-P ESP LPKKKIPER 

HPTC PPYSSHSFAHETQITMQSTVQLRTEYDPHTHVYYTTAEPRSEISVQPVTVTQDT LSCQSP 

MPTC PPYTSHSPAHETHITMQSTVQLRTEYDPHTHVyYTTAEPRSEISVQPVTVTQDMLSCQSP 

35 PTC GARHPKSCNNNRVPLPAQMPI.LEQPA 

BPTC AKTRKNDKTHRID-TTRQPLDPDVS 

HPTC ESTSSTRDIXSQrSDSSLHCLEPPCTKWTLSSFAEKHYAPFIXKPKAK\^IFLFIXJLW 

40 MPTC ESTSSTRDlXSQFSDSSIilCLEPPCTKWTLSSFAEKHYAPFIJJCPKAKVWIIXFLCLLG 

PTC OIPGSS HSLASP SLATFAFQHYTPFLMRSWVKFLTVMGFLAALI 

BPTC ENVTKT CCL-SV SLTKWAKNQYAPPIMRPAVKVTSMLALIAVIL 

45 HPTC VSLYGTTRVRDGLDLTDIVPRETREYDFIAAQFKYFSFYNMYIVTQKA-DYPNIQHLLYD 

PTC SSLYASTRI.QDGU)IIDLVPKDSNEHKFIJ5AQTRLFGFYSMYAVTQGNFEYPTQQQI.I^ 

BPTC TSVWGATrVjsjjGMLTDIVPENTDEHEFLSRQEKYFGFYNMYAVTQGNFEYPTNQKLLYE 

HPTC Iin^PSMVKYVMLEENKQLPKMVaHYFRDWI^LQDAET)SDWETGKIM^ 

50 MPTC I^CSFSKVKYVMLEENKQLPQMWUIYFRDWLWLQDAFDSDWETGRIMPNN-YKNGSDDG 

PTC YHDSFVRVPHVIKNDNCGLPDFWIJXFSEWMNI^KIFDEEYRDGRLTKECWFPNASSD^ 

BPTC YHDQFVRIPNI IKNDNGGLTKFWLSLFRDWLLDLQVAFDKEVASGCITQEYWCKNASDEG 

55 HPTC VLAYKLLVQTGSRDKPrDISQI.TK-QRI,VDADGriNPSAFYIYLTAWVSNDPVAYAASQA 

MPTC VLA5fKLLVQTGSRDKPI0rSQLTK-QRLVDADGIINPSAFYrYLTAWVSNDPVAYAASQA 

PTC ILAYKLIVQTGHVDNPVDKELVLT-NRLVNSDGIINQRAFYNYLSAWATNDVFAYGASQG 

BPTC IXAYKIJCVQTGHVDNPIDKSLITAGHRLVDKDGIINPKAFYMYLSAWATNDAIAYGASQG 

60 
HPTC NIRPHRPBWVHDKADYMPETRIJIIPAAEPIEYAQFPFYI^GLRDTSDPVEAIEKVRTICS

MPTC NIRPHRPEWVHDKADYMPETRLRI PAAEPIE YAQFPrYLNCLRDTSOPVEAIEKVRVlCN 

PTC KLYPEPRQYFHQPKSY DLKIPKSLPLVYAQMPFYLHGLTDTSQIKTLICHIRDLSV 

BPTC NLKPQPQRWIHSPBDV HLEIKKSSPt.lYTQLPFYLSGLSDTDSIKTLIRSVRDLCI. 

10 

HPTC MVTSLGLSSYPMCYPFLFWEQYICLPHWLIXFISVVIACTFLVCAVPLLMPWTAGIIVKV 

MPTC MYTSMLSSYPNGYPFLFWEQYISLRHWLIXSIS\n^CTFLVCAVPIJiIPWTAGIIVW 

PTC CTEGFGLPNYPSGIPFIFWEQYMTLRSSU^IUVCVLIJUaVLVSIXIXSW 

BPTC KYEAKGLPNFPSGIPFLFWEQYLYLRTSIXLAIACALGAVFIAVKVIJJJJAWA^ 

HPTC lALMTVELFGMMCLIGIKLSAVPWILIASVGIGVEFTVHVAIAFLTAIGDKNRRAVI^ 

MPTC LAZjnVEU^CMMGLIGIKLSAVPWILIASVGIGVEFTVHVAIAFLTAIGDKNHRAMIJ^ 

PTC VLASLAQIFGAMTLLGIKLSAIPAVILILSVGMMI.CFNVLISLGFHTSVGMRQRRVQLSM 

20 BPTC lATLVLQIXGVMALIXSVKLSAMPPVLLVIAIGRGVHFTVHLCLGPVTSIGCKRR^ 

HPTC EHMPAPViaMSAVSTLMVIJlIAGSEFDFXVRyFFAVrAlLTILGVLNGLVLLI^^ 

MPTC EHMFAPVLDGAVSTLlXiVUOAGSEFDFIVRYFFAVIAILTVLGVLNGLVLLPVLLSFFG 

25 PTC QMSLGPLVHGMLTSGVAVFMLSTSPFEFVIPHFCWLLLWLCVGACNSLLVFPILLSMVG 

BPTC ESVIAPVVHGAIJUUUAASMIA. ASEFGFVARLFLRLLLALVFUJLIDGLLFFPIVLSIM 

HPTC PYPEVSPANCLNRLPTPSPEPPPSWRFAMPPGHTHSGSDSSDSEYSSQTTVSGLSE-EL 

30 MPTC PCPEVSPANCLNRLPTPSPEPPPSWRPAVPPGHTNNCSDSSDSEYSSQTTVSGISE-EL 

PTC PEAELVPLEHPDRISTPSPLPVRSSKRSGKSYWQGSRSSRGSCQKSHHHHHKDLNDPSL 

BPTC PAAEVRPIEHPERLSTPSPKCSPIHPRKSSSSSGGGDKSSRTS — KSAPRPC APSL 

35 HPTC RHYEAQQGACGPAHQVIVEATENPVFAHSTWHPESRHHPPSNPRQQPHLDSGSLPPGRQ 

MPTC RQYEAQQCAGGPAHQVIVEATENPVPARSTWHPDSPHQPPLTPRQQPHLDSGSLSPGRQ 

PTC TTITEEPQSWKSSNSSIQMPNDWTYQPREQ — RPASYAAPPPAYHKAAAQQHHQHQGPPT 

BPTC TTITEEPSSWHSSAHSVQSSMQSrWQPEWVETTTYNGSDSASGRSTPTKSSHGGAITT 

40 

HPTC GQQPRRDPPREGLWPPLYRPRRDAFEISTEGHSGPSNRARWCPRGARSHNPPNPASTAMG 

MPTC GQQPRRDPPREGLRPPPYRPRRDAFEISTEGHSGPSNRDRSGPRGARSHMPRNPTSTAMG 

PTC TPPPPFPTA YPPELQS I WQPEVTVETTHS DS 

BPTC TKVTATANIKVEWTPSDRKSRRSYHYYDRRRDRDEDRDRDRERJDRDRDRDRDRDRDRDR 

45 

HPTC SSVPGYCQPITTVTASASVTVAVHPPPVPGPGRKPRGGLCPGY PETDHGLFEOPHVP 

MPTC SSVPSYCQPITTVTASASVTVAVHPP — PGPGRNPRCGPCPGYESYPETDHGVFEOPHVP 

PTC NT TKVTATAMIKVELAMP GPAVRS YNPTS 

50 BPTC DR DRERSRERDRP.ORYRD EPDHPA SPRENGRDSGHE 



HPTC FHVRCERRDSKVEVIELQDVECEERPRGSSSN 
MPTC FHVRCERRDSKVEVIELQDVECEERPWGSSSN 

55 PTC 

BPTC ' SDSSRH 

The identity of ten other clones recovered from the mouse library is not determined.
These cDNAs cross-hybridize with mouse ptc sequence, while differing as to their restriction
maps. These genes encode a family of proteins related to the patched protein. Alignment of the
human and mouse nucleotide sequences, which includes coding and noncoding sequence, reveals
89% identity.

Radiation hybrid mapping of the human ptc gene. Oligonucleotide primers and
conditions for specifically amplifying a portion of the human ptc gene from genomic DNA by
the polymerase chain reaction were developed. This marker was designated STS SHGC-8725.
It generates an amplification product of 196 bp, which is observed by agarose gel
electrophoresis when human DNA is used as a template, but not when rodent DNA is used.
Samples were scored in duplicate for the presence or absence of the 196 bp product in 83
radiation hybrid DNA samples from the Stanford G3 Radiation Hybrid Panel (purchased from
Research Genetics, Inc.). By comparison of the pattern of G3 panel scores with those for a series
of Genethon meiotic linkage markers, it was determined that the human ptc gene had a two
point lod score of 1,000 with the meiotic marker D9S287, based on no radiation breaks being
observed between the gene and the marker in 83 hybrid cell lines. These results indicate that
the ptc gene lies within 50-100 kb of the marker. Subsequent physical mapping in YAC and
BAC clones confirmed this close linkage estimate. Detailed map information can be obtained
from http://www.shgc.stanford.edu.
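
The statement that no radiation breaks were observed between the STS and D9S287 amounts to the two markers having the same presence/absence pattern across all 83 hybrids. The sketch below only counts discordant hybrids; it assumes each marker is scored as 1 (product present) or 0 (absent), uses invented score vectors, and does not reproduce the lod score calculation performed by radiation hybrid mapping software.

```python
def obligate_breaks(pattern_a, pattern_b):
    """Count hybrids in which exactly one of the two markers is retained."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("both markers must be scored on the same hybrids")
    return sum(1 for a, b in zip(pattern_a, pattern_b) if a != b)


# Hypothetical retention scores for 83 hybrids (not the real panel data).
sts_scores = [1 if i % 4 == 0 else 0 for i in range(83)]
linked_marker = list(sts_scores)          # identical pattern: no breaks
other_marker = list(sts_scores)
other_marker[10] = 1 - other_marker[10]   # one discordant hybrid

print(obligate_breaks(sts_scores, linked_marker))  # 0
print(obligate_breaks(sts_scores, other_marker))   # 1
```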

Analysis of BCNS mutations. The basal cell nevus syndrome has been mapped to the
same region of chromosome 9q as was found for ptc. An initial screen of EcoRI digested DNA
from probands of 84 BCNS kindreds did not reveal major rearrangements of the ptc gene, and
so screening was performed for more subtle sequence abnormalities. Using vectorette PCR, by
the method according to Riley et al. (1990) N.A.R. 18:2887-2890, on a BAC that contains
genomic DNA for the entire coding region of ptc, the intronic sequence flanking 20 of the 24
exons was determined. Single strand conformational polymorphism (SSCP) analysis of PCR-amplified
DNA from normal individuals, BCNS patients and sporadic basal cell carcinomas (BCC) was
performed for 20 exons of ptc coding sequence. The amplified samples giving abnormal bands
on SSCP were then sequenced.

In blood cell DNA from BCNS individuals, four independent sequence changes were
found: two in exon 15 and two in exon 10. One 49 year old man was found to have a sequence
change in exon 15. His affected sister and daughter have the same alteration, but three
unaffected relatives do not. His blood cell DNA has an insertion of 9 base pairs at nucleotide
2445 of the coding sequence, resulting in the insertion of three amino acids (PNI) after amino
acid 815. Because the normal sequence preceding the insertion is also PNI, a direct repeat has
been formed.
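
The relationship between the nucleotide coordinate and the amino acid coordinate of this insertion is simple arithmetic, sketched below. The sketch assumes that "at nucleotide 2445" means the inserted bases follow coding nucleotide 2445 (numbered from 1); the function name and return structure are illustrative only.

```python
def describe_insertion(last_nt_before: int, inserted_nt: int) -> dict:
    """Summarize the reading-frame consequence of a coding-sequence insertion."""
    complete_codons, offset = divmod(last_nt_before, 3)
    in_frame = inserted_nt % 3 == 0
    return {
        "in_frame": in_frame,
        "codons_before": complete_codons,   # whole codons preceding the insertion
        "splits_codon": offset != 0,        # True if the insertion lands inside a codon
        "residues_added": inserted_nt // 3 if in_frame else None,
    }


# 9 bp inserted after nucleotide 2445: 2445 / 3 = 815 complete codons precede it.
print(describe_insertion(2445, 9))
```

Because 2445 is a multiple of three and 9 bp is three codons, the insertion falls between codons and adds three residues after amino acid 815 without shifting the frame, in agreement with the text.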

The second case of an exon 15 change is an 18 year old woman who developed jaw
cysts at age 9 and BCCs at age 6. The developmental effects together with the BCCs indicate
that she has BCNS, although none of her relatives are known to have the syndrome. Her blood
cell DNA has a deletion of 11 bp, removing the sequence ATATCCAGCAC at nucleotides 2441
to 2452 of the coding sequence. In addition, nucleotide 2452 is changed from a T to an A. The
deletion results in a frameshift that is predicted to truncate the protein after amino acid 813 with
the addition of 9 amino acids. The predicted mutant protein is truncated after the seventh
transmembrane domain. In Drosophila, a ptc protein that is truncated after the sixth
transmembrane domain is inactive when ectopically expressed, in contrast to the full-length
protein, suggesting that the human protein is inactivated by the exon 15 sequence change. The
patient with this mutation is the first affected family member, since her parents, age 48 and 50,
have neither BCCs nor other signs of the BCNS. DNA from both parents has the normal
nucleotide sequence for exon 15, indicating that the alteration in exon 15 arose in the same
generation as did the BCNS phenotype. Hence her disease is the result of a new mutation. This
sequence change is not detected in 84 control chromosomes.

Analysis of sporadic basal cell carcinomas. To determine whether ptc is also
involved in BCCs that are not associated with the BCNS or germline changes, DNA was
examined from 12 sporadic BCCs. Three alterations were found in these tumors. In one tumor,
a C to T transition in exon 3 at nucleotide 523 of the coding sequence changes a highly
conserved leucine to phenylalanine at residue 175 in the first putative extracellular loop domain.
Blood cell DNA from the same individual does not have the alteration, suggesting that it arose
somatically in the tumor. SSCP was used to examine exon 3 DNA from 60 individuals who do
not have BCNS, and no changes from the normal sequence were found. Two other sporadic BCCs
have deletions encompassing exon 9 but not extending to exon 8.
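
The exon 3 change can be checked for internal consistency: nucleotide 523 should fall in codon 175, and a single C to T change at that codon position should be able to convert a leucine codon into a phenylalanine codon. The sketch below assumes 1-based numbering from the first coding nucleotide and the standard genetic code; it does not use the actual ptc codon, which is not given in the text.

```python
def codon_of(nt_position: int) -> tuple:
    """Map a 1-based coding nucleotide to (codon number, position within codon)."""
    codon = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon, position_in_codon


LEU_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}
PHE_CODONS = {"TTT", "TTC"}


def leu_to_phe_by_c_to_t(position_in_codon: int) -> list:
    """List Leu codons that a C>T change at the given position turns into Phe."""
    index = position_in_codon - 1
    hits = []
    for codon in sorted(LEU_CODONS):
        if codon[index] == "C":
            mutant = codon[:index] + "T" + codon[index + 1:]
            if mutant in PHE_CODONS:
                hits.append((codon, mutant))
    return hits


codon, position = codon_of(523)
print(codon, position)                 # codon 175, first position
print(leu_to_phe_by_c_to_t(position))  # CTC>TTC and CTT>TTT qualify
```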

The existence of sporadic and hereditary forms of BCCs is reminiscent of the
characteristics of the two forms of retinoblastoma. This parallel, and the frequent deletion in
tumors of the copy of chromosome 9q predicted by linkage to carry the wild-type allele,
demonstrate that the human ptc is a tumor suppressor gene. ptc represses a variety of genes,
including growth factors, during Drosophila development and may have the same effect in
human skin. The often reported large body size of BCNS patients also could be due to reduced
ptc function, perhaps due to loss of control of growth factors. The C to T transition identified
in ptc in the sporadic BCC is also a common genetic change in the p53 gene in BCC and is
consistent with the role of sunlight in causing these tumors. By contrast, the inherited deletion
and insertion mutations identified in BCNS patients, as expected, are not those characteristic
of ultraviolet mutagenesis.

The identification of the ptc mutations as a cause of BCNS links a large body of
developmental genetic information to this important human disease. In embryos lacking ptc
function, part of each body segment is transformed into an anterior-posterior mirror-image
duplication of another part. The patterning changes in ptc mutants are due in part to
derepression of another segment polarity gene, wingless, a homolog of the vertebrate Wnt genes
that encode secreted signaling proteins. In normal embryonic development, ptc repression of
wg is relieved by the Hh signaling protein, which emanates from adjacent cells in the posterior
part of each segment. The resulting localized wg expression in each segment primordium
organizes the pattern of bristles on the surface of the animal. The ptc gene inactivates its own
transcription, while Hh signaling induces ptc transcription.

In flies two other proteins work together with Hh to activate target genes: the ser/thr
kinase fused and the zinc finger protein encoded by cubitus interruptus. Negative regulators
working together with ptc to repress targets are protein kinase A and costal2. Thus, mutations
that inactivate human versions of protein kinase A or costal2, or that cause excessive activity
of human hh, gli, or a fused homolog, may modify the BCNS phenotype and be important in
tumorigenesis.

In accordance with the subject invention, mammalian patched genes, including the
mouse and human genes, are provided, which can serve many purposes. Mutations in the gene
are found in patients with basal cell nevus syndrome, and in sporadic basal cell carcinomas. The
autosomal dominant inheritance of BCNS indicates that patched is a tumor suppressor gene.
The patched protein may be used in screening for agonists and antagonists, and for assaying
for the transcription of ptc mRNA. The protein or fragments thereof may be used to produce
antibodies specific for the protein or specific epitopes of the protein. In addition, the gene may
be employed for investigating embryonic development, by screening fetal tissue, preparing
transgenic animals to serve as models, and the like.

As described above, patients with basal cell nevus syndrome have a high incidence of
multiple basal cell carcinomas, medulloblastomas, and meningiomas. Because somatic ptc
mutations have been found in sporadic basal cell carcinomas, we have screened for ptc
mutations in several types of sporadic extracutaneous tumors. We found that 2 of 14 sporadic
medulloblastomas bear somatic nonsense mutations in one copy of the gene and also deletion
of the other copy. In addition, we identified missense mutations in ptc in two of seven breast
carcinomas, one of nine meningiomas, and one colon cancer cell line. No ptc gene mutations
were detected in 10 primary colon carcinomas and eighteen bladder carcinomas.

BCNS (OMIM #109400) is a rare autosomal dominant disease with diverse
phenotypic abnormalities, both tumorous (BCCs, medulloblastomas, and meningiomas) and
developmental (misshapen ribs, spina bifida occulta, and skull abnormalities; Gorlin, R.J. (1987)
Medicine 66:98-113). The BCNS gene was mapped to chromosome 9q22.3 by linkage analysis
of BCNS families and by LOH analysis in sporadic BCCs (Gailani, M.R. et al. (1992) Cell
69:111-117). LOH in sporadic medulloblastomas has been reported in the same chromosome
region (Schofield, D. et al. (1995) Am J Pathol 146:472-480). Recently, the human homologue
of the Drosophila patched (PTCH) gene has been mapped to the BCNS region (Hahn, H. et al.
(1996) Cell 85:841-851; Johnson, R.L. et al. (1996) Science 272:1668-1671; Gailani, M.R. et
al. (1996) Nat Genet 14:78-81; Xie, J. et al. (1997) Genes Chromosomes Cancer 18:305-309),
and mutations in this gene have been found in the blood DNA of BCNS patients and in the DNA
of sporadic BCCs (Hahn, H. et al., supra; Johnson, R.L. et al., supra; Gailani, M.R. et al.,
supra; and Chidambaram, A. et al. (1996) Cancer Res 56:4599-4601). ptc appears to function
as a tumor suppressor gene; inactivation abrogates its normal inhibition of the hedgehog
signaling pathway. Because of the wide variety of tumors in patients with the BCNS and the wide
tissue distribution of ptc gene expression, we have begun screening for ptc gene mutations in
several types of human cancers, especially those present in increased numbers in BCNS patients
(medulloblastomas), those in tissues derived embryologically from epidermis (breast carcinomas)
and those with chromosome 9q LOH (bladder carcinomas; see Cairns, P. et al. (1993) Cancer
Res 53:1230-1232; and Sidransky, D. et al. (1997) NEJM 326:1^1-1401).
Materials and methods 

Clinical Materials. Diagnoses of all tumors were confirmed histologically. Cell lines
were obtained from the American Type Culture Collection. DNA was extracted from tumors or
matched normal tissue (peripheral blood leukocytes or skin) as described (Cogen, P.H. et al.
(1990) Genomics 8:279-285; and Sambrook, J. et al. Molecular Cloning: A Laboratory
Manual, Ed. 2, Vol. 2, pp. 9.17 - 9.19, Cold Spring Harbor, NY (1989)).

PCR and Heteroduplex Analysis. PCR amplification and heteroduplex/SSCP analysis
were performed as described (Johnson, R.L. et al., supra; Spritz, R.A. et al. (1992) Am J Hum
Genet 51:1058-1065). Primers used and intron/exon boundary sequences of the ptc gene were
derived as reported previously (Johnson, R.L. et al., supra) and are shown in Table 1. Primers
for exons 1 and 2 were from Hahn et al. (supra).

Sequence Analysis. Exon segments exhibiting bands were reamplified and were
sequenced directly using the Sequenase sequencing kit according to the protocol recommended
by the manufacturer (United States Biochemical Corp.). A second sequencing was performed
using independently amplified PCR products to confirm the sequence change. The amplified
PCR products from each tumor were also cloned into the plasmid vector pCR 2.1 (Invitrogen),
followed by sequence analysis of at least four independent clones. The sequence alteration was
confirmed from at least two independent clones. Simplified amplification of specific allele
analysis was performed according to Lei and Hall (Lei, X. and Hall, B.G. (1994) Biotechniques
16:44-45).

Allele Loss Analysis. Microsatellites used for allelic loss analysis were D9S109,
D9S119, D9S127, D9S196, and D9S287 described in the CHLC human screening set (Research
Genetics). A part of the ptc intron 1 sequence was tested for polymorphism in a control
population and found to be polymorphic in 80% of the samples tested. This microsatellite was
used for analysis of ptc gene allelic loss in bladder carcinomas. The primer sequences are as
follows: forward primer, 5'-CTGAGCAGATTTCCCAGGTC-3'; and reverse primer,
5'-CCTCAGACAGACCTTTCCTC-3'. The PCR cycling for this newly isolated marker was 4
min at 95°C, followed by 30 cycles of 40 s at 95°C, 2 min at 60°C, and 1 min at 72°C. PCR
products were separated on 6% polyacrylamide gels and exposed to film.
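
As a rough plausibility check on the 60°C annealing step, the GC content and an approximate melting temperature of the two intron 1 primers can be computed. The sketch below uses the Wallace rule (2°C per A/T, 4°C per G/C), which is only a coarse estimate for short oligonucleotides and is not stated in the text as the method used to choose the cycling conditions.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


FORWARD = "CTGAGCAGATTTCCCAGGTC"  # forward primer from the text
REVERSE = "CCTCAGACAGACCTTTCCTC"  # reverse primer from the text

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), gc_percent(primer), wallace_tm(primer))
```

Both primers are 20 nucleotides long with 55% GC, and the Wallace estimate of 62°C for each is close to the annealing temperature given above.
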
Results and Discussion 

Intronic boundaries were determined for 22 exons of ptc by sequencing vectorette
PCR products derived from BAC 192J22 (Johnson, R.L. et al., supra; Table 1). Our findings are in
agreement with those of Hahn et al. (supra), except that we find exon 12 is composed of 2
separate exons of 126 and 119 nucleotides. This indicates that ptc is composed of 23 coding
exons instead of 22. In addition, we find that exons 3, 4, 10, 11, 17, 21, and 23 differ slightly
in size from those reported previously (Hahn et al., supra). Of 63 tumors studied, 14 were sporadic
medulloblastomas, and 9 were sporadic meningiomas. These 23 tumors were examined for
allelic deletions by genotyping of tumor and blood DNA with microsatellite markers that flank
the ptc gene: D9S119, D9S196, D9S287, D9S127, and D9S109. Four of 14 medulloblastomas
had LOH. Two of the medulloblastomas, both of which had LOH, had mutations (med34 and
med36; see Cogen, P.H. et al., supra), which are predicted to result in truncated proteins (Table
2). DNA samples from the blood of these patients lack these mutations, indicating that they
both are somatic mutations. med34 also has allelic loss on 17p (Cogen, P.H. et al., supra). We
were unable to detect ptc gene mutations by heteroduplex analysis in the other two
medulloblastomas bearing LOH on 9q. The pathological features of these two tumors differed
in that med34 belongs to the desmoplastic subtype, whereas med36 is of the classic type,
indicating that ptc mutations in medulloblastomas are not restricted to a specific subtype.
TABLE 1 Primers and boundary sequences of PTCH

One report (Schofield, D. et al., supra) has shown that five medulloblastomas (two
BCNS-associated cases and three sporadic cases) bearing LOH on chromosome 9q22.3-q31 are
all of the desmoplastic subtype, suggesting LOH on 9q22.3 is histological subtype specific. We
feel that the conclusion derived from only five positive tumors is not a strong one because we
and others (Raffel, C. et al. (1997) Cancer Res 57:842-845) have found nondesmoplastic
subtypes of medulloblastomas bearing LOH on chromosome 9q22.3. Independently, another
group has reported their finding of ptc mutations in sporadic medulloblastomas (Raffel, C. et
al., supra).
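
The LOH calls discussed above rest on comparing the microsatellite alleles seen in normal (blood) DNA with those seen in tumor DNA from the same patient. The sketch below is a simplified illustration of that comparison, assuming alleles are recorded as PCR product sizes and treated as simply present or absent; real gels usually show reduced rather than absent bands because of contaminating normal cells, and the allele sizes used here are invented.

```python
def call_loh(normal_alleles, tumor_alleles) -> str:
    """Classify one microsatellite locus from normal and tumor genotypes."""
    normal = set(normal_alleles)
    tumor = set(tumor_alleles)
    if len(normal) < 2:
        return "uninformative"   # homozygous in normal DNA: loss cannot be seen
    if tumor == normal:
        return "retained"        # both alleles still present in the tumor
    if tumor and tumor < normal:
        return "LOH"             # tumor keeps only a subset of the normal alleles
    return "inconsistent"        # e.g. a tumor allele absent from normal DNA


# Hypothetical allele sizes in base pairs.
print(call_loh((150, 158), (150,)))      # LOH
print(call_loh((150, 158), (150, 158)))  # retained
print(call_loh((150, 150), (150,)))      # uninformative
```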

A change of T to C at nucleotide 2990 (in exon 18) was identified in DNA from one
of nine sporadic meningiomas, causing a predicted change of codon 997 from Ile to Thr (Table
2). The meningioma bearing this mutation also has allelic loss on 9q22.3. Blood cell DNA is
heterozygous for this mutation, but DNA from the tumor contains only the mutant sequence.
Of 100 normal chromosomes examined, none has this sequence change, suggesting that this
mutation is not likely a common polymorphism. This patient is 84 years old and has had no
phenotypic abnormalities suggestive of the BCNS, suggesting that this sequence alteration may
not have caused complete inactivation of the ptc gene. None of the other eight meningiomas
had detectable LOH at chromosome 9q.
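
The inference that a change absent from 100 normal chromosomes is unlikely to be a common polymorphism can be made quantitative. The sketch below computes the largest allele frequency still compatible, at a chosen confidence level, with observing zero carriers among n independently sampled chromosomes. It is an illustrative calculation under that independence assumption, not an analysis reported in the text.

```python
def upper_bound_frequency(n_chromosomes: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on allele frequency when 0 of n chromosomes carry it.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_chromosomes)


bound = upper_bound_frequency(100)
print(f"{bound:.4f}")  # roughly 0.03, i.e. about 3%
```

With 100 chromosomes the 95% upper bound is close to 3%, so a variant at a frequency well above a few percent would very probably have been seen at least once.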



TABLE 2 PATCHED gene alterations

We also examined a variety of other tumors: 7 breast carcinomas, 11 colon cancers (10
primary tumors and 1 cell line), 18 bladder tumors (14 primary tumors and 4 cell lines), and 2
ovarian cancer cell lines. These tumors are not known to occur in higher than expected
frequency in BCNS patients. We identified sequence abnormalities in two breast carcinomas
and in the one colon cancer cell line (Table 2). The mutation found in breast carcinoma Br349
is not present in the patient's normal skin DNA, indicating that the sequence change is a
somatic mutation. Direct sequencing of the PCR product indicated that only the mutant allele
is present in the tumor. This mutation changes codon 955 from Tyr to His, and this Tyr is
conserved in human, murine, chicken, and fly ptc homologues (Goodrich, L.V. et al. (1996)
Genes Dev 10:301-312). The mutation in breast carcinoma Br321 is predicted to change codon
995 from Glu to Gly, and the tumor with this mutation retains the wild-type allele. We have
sequenced exon 18 in DNA from the blood of 50 normal persons and found no changes from
the published sequence, suggesting that the sequence change found in Br321 is not a common
polymorphism. Furthermore, examination of the DNA from the cultured skin fibroblasts of the
patient did not reveal the same mutation, indicating that this is a somatic mutation.

Because DNA is not available from normal cells of the patient from which colon cell
line 320 was established, we used simplified amplification of specific allele analysis (Lei, X. and
Hall, B.G., supra) to examine 50 normal blood DNA samples for the presence of the sequence
alteration and found none but the DNA from this cell line to have the mutant allele, suggesting
that this mutation also is unlikely to be a common sequence polymorphism. For bladder
carcinomas, a newly isolated microsatellite that was derived from intron 1 of the ptc gene was
used to examine LOH in the tumor. Three primary bladder carcinomas showed LOH at this
intragenic locus. With no ptc mutations detected in these tumors, we suspect that the LOH in
these three bladder carcinomas may reflect the high incidence of whole chromosome 9 loss in
bladder cancers (Sidransky, D. et al., supra). A similar observation has been reported
previously (Simoneau, A.R. et al. (1996) Cancer Res 56:5039-5043).

We also detected a sequence change in intron 10 in two colon carcinomas, 15-1 and
8-1, an alteration that was reported previously as a splicing mutation (Unden, A.B. et al. (1996)
Cancer Res 56:4562-4565). Because we found the same sequence change in about 20% of
normal control samples, we suggest that this is more likely a polymorphism. The
ptc protein is predicted to contain 12 transmembrane domains, two large extracellular loops, and
one intracellular loop (Goodrich, L.V. et al., supra). Of the six mutations we identified, four
are missense mutations. Three mutations lead to amino acid substitutions in the second
extracellular loop, and one mutation results in an amino acid change in the intracellular domain.

Our data indicate that somatic inactivation of the ptc gene does occur in some
sporadic medulloblastomas. In addition, because missense mutations of the ptc gene were
detected in breast carcinomas, we suspect that defects of the ptc function also may be involved
in some breast carcinomas, although biochemical evidence is necessary to show how these
missense mutations might impair ptc function. Of 11 colon cancers and 18 bladder carcinomas
examined, we found only one mutation in 1 colon cell line, suggesting that ptc gene mutations
are relatively uncommon in colon and bladder cancers, although the incidence of chromosome 9
loss in bladder cancers is high (Cairns, P. et al., supra).

Published reports of SSCP analysis of tumor DNA identified mutations in the ptc gene
in only 30% of sporadic BCCs, although chromosome 9q22.3 LOH was reported in more than
50% of these tumors (Gailani, M.R. et al., supra). It has been reported that heteroduplex/SSCP
analysis of gene mutations is more sensitive than SSCP analysis (Spritz, R.A. et al., supra). In
our studies, we were able to identify a point mutation in the 310-bp PCR product from exon 15
using heteroduplex analysis, whereas SSCP analysis failed to reveal this sequence change (Table
2). Therefore, we suspect that there may be more mutations in BCCs than we have found thus
far. Analysis of the ptc gene in BCNS patients and in sporadic BCCs has identified mutations
scattered widely across the gene, and the majority of mutations were predicted to result in
truncated proteins (Hahn, H. et al., supra; Johnson, R.L. et al., supra; Gailani, M.R. et al.,
supra; Chidambaram, A. et al., supra; Unden, A.B. et al., supra; Wicking, C. et al. (1997) Am
J Hum Genet 60:21-26). In our screening, we found two breast carcinomas bearing missense
mutations of the ptc gene. In one of these two tumors, Br349, direct sequencing indicated a
deletion of the other copy of the ptc gene. Any comparison of mutations in skin cancers versus
extracutaneous tumors must consider the wholly different causes of these mutations; UV light
is unique to the skin.

All publications and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be readily apparent to
those of ordinary skill in the art in light of the teachings of this invention that certain changes
and modifications may be made thereto without departing from the spirit or scope of the
appended claims.

SEQUENCE LISTING






(1) GENERAL INFORMATION:

    (i) APPLICANT: SCOTT, MATTHEW P.
                   GOODRICH, LISA V.
                   JOHNSON, RONALD L.

    (ii) TITLE OF INVENTION: Patched Genes and Their Use

    (iii) NUMBER OF SEQUENCES: 19

    (iv) CORRESPONDENCE ADDRESS:
        (A) ADDRESSEE: Foley, Hoag & Eliot LLP
        (B) STREET: One Post Office Square
        (C) CITY: Boston
        (D) STATE: MA
        (E) COUNTRY: US
        (F) ZIP: 02109

    (v) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
        (A) APPLICATION NUMBER:
        (B) FILING DATE:
        (C) CLASSIFICATION:

    (viii) ATTORNEY/AGENT INFORMATION:
        (A) NAME: Vincent, Matthew P.
        (B) REGISTRATION NUMBER: 36,709
        (C) REFERENCE/DOCKET NUMBER: SUV003.26

    (ix) TELECOMMUNICATION INFORMATION:
        (A) TELEPHONE: 617-632-1000
        (B) TELEFAX: 617-832-7000

(2) INFORMATION FOR SEQ ID NO:1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 736 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AACNNCNNTN NATGGCACCC CCNCCCAACC TTTNNNCCNN NTAANCAAAA NNCCCCNTTT 60

NATACCCCCT NTAANANTTT TCCACCNNNC NNAAANNCCN CTGNANACMA NGNAAANCCM 120

TTTTTNAACC CCCCCCACCC GGAATTCCNA NTNNCCNCCC CCAAATTACA ACTCCAGNCC 180

AAAATTNANA NAATTGGTCC TAACCTAACC NATNGTTGTT ACGGTTTCCC CCCCCAAATA 240

CATGCACTGG CCCGAACACT TGATCGTTGC CGTTCCAATA AGAATAAATC TGGTCATATT 300

AAACAAGCCN AAAGCTTTAC AAACTGTTGT ACAATTAATG GGCGAACACG AACTGTTCGA 360

ATTCTGGTCT GGACATTACA AAGTGCACCA CATCGGATGG AACCAGGAGA AGGCCACAAC 420

CGTACTGAAC GCCTGGCAGA AGAAGTTCGC ACAGGTTGGT GGTTGGCGCA AGGAGTAGAG 480

TGAATGGTGG TAATTTTTGG TTGTTCCAGG AGGTGGATCG TCTGACGAAG AGCAAGAAGT 540

CGTCGAATTA CATCTTCGTG ACGTTCTCCA CCGCCAATTT GAACAAGATG TTGAAGGAGG 600

CGTCGAANAC GGACGTGGTG AAGCTGGGGG TGGTGCTGGG GGTGGCGGCG GTGTACGGGT 660

GGGTGGCCCA GTCGGGGCTG GCTGCCTTGG GAGTGCTGGT CTTNGCGNGC TNCNATTCGC 720

CCTATAGTNA GNCGTA 736

(2) INFORMATION FOR SEQ ID NO:2:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 107 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Xaa Pro Pro Pro Asn Tyr Asn Ser Xaa Pro Lys Xaa Xaa Xaa Leu Val
1 5 10 15

Leu Thr Pro Xaa Val Val Thr Val Ser Pro Pro Lys Tyr Met His Trp
20 25 30

Pro Glu His Leu Ile Val Ala Val Pro Ile Arg Ile Asn Leu Val Ile
35 40 45

Leu Asn Lys Pro Lys Ala Leu Gln Thr Val Val Gln Leu Met Gly Glu
50 55 60

His Glu Leu Phe Glu Phe Trp Ser Gly His Tyr Lys Val His His Ile
65 70 75 80

Gly Trp Asn Gln Glu Lys Ala Thr Thr Val Leu Asn Ala Trp Gln Lys
85 90 95

Lys Phe Ala Gln Val Gly Gly Trp Arg Lys Glu
100 105

(2) INFORMATION FOR SEQ ID NO:3:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 5187 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:



GGGTCTGTCA CCCGGAGCCG GAGTCCCCGG CGGCCAGCAG CGTCCTCGCG AGCCGAGCGC 60

CCAGGCGCGC CCGGAGCCCG CGGCGGCGGC GGCAACATGG CCTCGGCTGG TAACGCCGCC 120

GGGGCCCTGG GCAGGCAGGC CGGCGGCGGG AGGCGCAGAC GGACCGGGGG ACCGCACCGC 180

GCCGCGCCGG ACCGGGACTA TCTGCACCGG CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240

GAGCAGATTT CCAAGGGGAA GGCTACTGGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300

TTTCAGAGAC TCTTATTTAA ACTGGGTTGT TACATTCAAA AGAACTGCGG CAAGTTTTTG 360

GTTGTGGGTC TCCTCATATT TGGGGCCTTC GCTGTGGGAT TAAAGGCAGC TAATCTCGAG 420

ACCAACGTGG AGGAGCTGTG GGTGGAAGTT GGTGGACGAG TGAGTCGAGA ATTAAATTAT 480

ACCCGTCAGA AGATAGGAGA AGAGGCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540

AAAGAAGAAG GCGCTAATGT TCTGACCACA GAGGCTCTCC TGCAACACCT GGACTCAGCA 600

CTCCAGGCCA GTCGTGTGCA CGTCTACATG TATAACAGGC AATGGAAGTT GGAACATTTG 660

TGCTACAAAT CAGGGGAACT TATCACGGAG ACAGGTTACA TGGATCAGAT AATAGAATAC 720

CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780

TCCGGGACAG CATACCTCCT AGGTAAGCCT CCTTTACGGT GGACAAACTT TGACCCCTTG 840

GAATTCCTAG AAGAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTGGGA GGAAATGCTG 900

AATAAAGCCG AAGTTGGCCA TGGGTACATG GACCGGCCTT GCCTCAACCC AGCCGACCCA 960

GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTGATGT GGCCCTTGTT 1020

TTGAATGGTG GATGTCAAGG TTTATCCAGG AAGTATATGC ATTGGCAGGA GGAGTTGATT 1080

GTGGGTGGTA CCGTCAAGAA TGCCACTGGA AAACTTGTCA GCGCTCACGC CCTGCAAACC 1140

ATGTTCCAGT TAATGACTCC CAAGCAAATG TATGAACACT TCAGGGGCTA CGACTATGTC 1200

TCTCACATCA ACTGGAATGA AGACAGGGCA GCCGCCATCC TGGAGGCCTG GCAGAGGACT 1260

TACGTGGAGG TGGTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320

ACAACCACGA CCCTGGACGA CATCCTAAAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380

GCCAGCGGCT ACCTACTGAT GCTTGCCTAT GCCTGTTTAA CCATGCTGCG CTGGGACTGC 1440

TCCAAGTCCC AGGGTGCCGT GGGGCTGGCT GGCGTCCTGT TGGTTGCGCT GTCAGTGGCT 1500

GCAGGATTGG GCCTCTGCTC CTTGATTGGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560

TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG GATGATGTCT TCCTCCTGGC CCATGCATTC 1620

AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680

CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740

GCATTGATCC CTATCCCTGC CCTGCGAGCG TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800

TTCAATTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860

CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920

ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980

CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCACTATGCA GTCCACCGTT 2040

CAGCTCCGCA CAGAGTATGA CCCTCACACG CACGTGTACT ACACCACCGC CGAGCCACGC 2100

TCTGAGATCT CTGTACAGCC TGTTACCGTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160

GAGAGCACCA GCTCTACCAG GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220

CTCGAGCCCC CCTGCACCAA GTGGACAGTG TCTTCGTTTG CAGAGAAGCA CTATGCTCCT 2280

TTCCTCCTGA AACCCAAAGC CAAGGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340

GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACGGA CATTGTTCCC 2400

CGGGAAACCA GAGAATATGA CTTCATAGCT GCCCAGTTCA AGTACTTCTC TTTCTACAAC 2460

ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520

CATAAGAGTT TCAGCAATGT GAAGTATGTC ATGCTGGAGG AGAACAAGCA ACTTCCCCAA 2580

ATGTGGCTGC ACTACTTTAG AGACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640

TGGGAAACTG GGAGGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGGGTCCTC 2700

GCTTACAAAC TCCTGGTGCA GACTGGCAGC CGAGACAAGC CCATCGACAT TAGTCAGTTG 2760

ACTAAACAGC GTCTGGTAGA CGCAGATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820

CTGACCGCTT GGGTCAGCAA CGACCCTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880

CCTCACCGGC CGGAGTGGGT CCATGACAAA GCCGACTACA TGCCAGAGAC CAGGCTGAGA 2940

ATCCCAGCAG CAGAGCCCAT CGAGTACGCT CAGTTCCCTT TCTACCTCAA CGGCCTACGA 3000

GACACCTCAG ACTTTGTGGA AGCCATAGAA AAAGTGAGAG TCATCTGTAA CAACTATACG 3060

AGCCTGGGAC TGTCCAGCTA CCCCAATGGC TACCCCTTCC TGTTCTGGGA GCAATACATC 3120

AGCCTGCGCC ACTGGCTGCT GCTATCCATC AGCGTGGTGC TGGCCTGCAC GTTTCTAGTG 3180

TGCGCAGTCT TCCTCCTGAA CCCCTGGACG GCCGGGATCA TTGTCATGGT CCTGGCTCTG 3240

ATGACCGTTG AGCTCTTTGG CATGATGGGC CTCATTGGGA TCAAGCTGAG TGCTGTGCCT 3300

GTGGTCATCC TGATTGCATC TGTTGGCATC GGAGTGGAGT TCACCGTCCA CGTGGCTTTG 3360

GCCTTTCTGA CAGCCATTGG GGACAAGAAC CACAGGGCTA TGCTCGCTCT GGAACACATG 3420

TTTGCTCCCG TTCTGGACGG TGCTGTGTCC ACTCTGCTGG GTGTACTGAT GCTTGCAGGG 3480

TCCGAATTTG ATTTCATTGT CAGATACTTC TTTGCCGTCC TGGCCATTCT CACCGTCTTG 3540

GGGGTTCTCA ATGGACTGGT TCTGCTGCCT GTCCTCTTAT CCTTCTTTGG ACCGTGTCCT 3600

GAGGTGTCTC CAGCCAATGG CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCGCCTCCA 3660

AGTGTCGTCC GGTTTGCCGT GCCTCCTGGT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720

TCGGAGTACA GCTCTCAGAC CACGGTGTCT GGCATCAGTG AGGAGCTCAG GCAATACGAA 3780

GCACAGCAGG GTGCCGGAGG CCCTGCCCAC CAAGTGATTG TGGAAGCCAC AGAAAACCCT 3840

GTCTTTGCCC GGTCCACTGT GGTCCATCCG GACTCCAGAC ATCAGCCTCC CTTGACCCCT 3900

CGGCAACAGC CCCACCTGGA CTCTGGCTCC TTGTCCCCTG GACGGCAAGG CCAGCAGCCT 3960

CGAAGGGATC CCCCTAGAGA AGGCTTGCGG CCACCCCCCT ACAGACCGCG CAGAGACGCT 4020

TTTGAAATTT CTACTGAAGG GCATTCTGGC CCTAGCAATA GGGACCGCTC AGGGCCCCGT 4080

GGGGCCCGTT CTCACAACCC TCGGAACCCA ACGTCCACCG CCATGGGCAG CTCTGTGCCC 4140

AGCTACTGCC AGCCCATCAC CACTGTGACG GCTTCTGCTT CGGTGACTGT TGCTGTGCAT 4200

CCCCCGCCTG GACCTGGGCG CAACCCCCGA GGGGGGCCCT GTCCAGGCTA TGAGAGCTAC 4260

CCTGAGACTG ATCACGGGGT ATTTGAGGAT CCTCATGTGC CTTTTCATGT CAGGTGTGAC 4320

AGGAGGGACT CAAAGGTGGA GGTCATAGAG CTACAGGACG TGGAATGTGA GGAGAGGCCG 4380

TGGGGGAGCA GCTCCAACTG AGGGTAATTA AAATCTGAAG CAAAGAGGCC AAAGATTGGA 4440

AAGCCCCGCC CCCACCTCTT TCCAGAACTG CTTGAAGAGA ACTGCTTGGA ATTATGGGAA 4500

GGCAGTTCAT TGTTACTGTA ACTGATTGTA TTATTKKGTG AAATATTTCT ATAAATATTT 4560

AARAGGTGTA CACATGTAAT ATACATGGAA ATGCTGTACA GTCTATTTCC TGGGGCCTCT 4620

CCACTCCTGC CCCAGAGTGG GGAGACCACA GGGGCCCTTT CCCCTGTGTA CATTGGTCTC 4680

TGTGCCACAA CCAAGCTTAA CTTAGTTTTA AAAAAAATCT CCCAGCATAT GTCGCTGCTG 4740

CTTAAATATT GTATAATTTA CTTGTATAAT TCTATGCAAA TATTGCTTAT GTAATAGGAT 4800

TATTTGTAAA GGTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860

ATGAATTGTT ACTGTTAACT TTTGAACACG CTATGCGTGG TAATTGTTTA ACGAGCAGAC 4920

ATGAAGAAAA CAGGTTAATC CCAGTGGCTT CTCTAGGGGT AGTTGTATAT GGTTCGCATG 4980

GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA TTGTGGTTTG TTGTTGTTGT 5040

TGCTGTTGTT GTTCATTTTG GTGTTTTTGG TTGCTTTGTA TGATCTTAGC TCTGGCCTAG 5100

GTGGGCTGGG AAGGTCCAGG TCTTTTTCTG TCGTGATGCT GGTGGAAAGG TGACCCCAAT 5160

CATCTGTCCT ATTCTCTGGG ACTATTC 5187
(2) INFORMATION FOR SEQ ID NO:4:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 1311 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Val Ala Pro Asp Ser Glu Ala Pro Ser Asn Pro Arg Ile Thr Ala
1 5 10 15

Ala His Glu Ser Pro Cys Ala Thr Glu Ala Arg His Ser Ala Asp Leu
20 25 30

Tyr Ile Arg Thr Ser Trp Val Asp Ala Ala Leu Ala Leu Ser Glu Leu
35 40 45

Glu Lys Gly Asn Ile Glu Gly Gly Arg Thr Ser Leu Trp Ile Arg Ala
50 55 60

Trp Leu Gln Glu Gln Leu Phe Ile Leu Gly Cys Phe Leu Gln Gly Asp
65 70 75 80

Ala Gly Lys Val Leu Phe Val Ala Ile Leu Val Leu Ser Thr Phe Cys
85 90 95

Val Gly Leu Lys Ser Ala Gln Ile His Thr Arg Val Asp Gln Leu Trp
100 105 110

Val Gln Glu Gly Gly Arg Leu Glu Ala Glu Leu Lys Tyr Thr Ala Gln
115 120 125

Ala Leu Gly Glu Ala Asp Ser Ser Thr His Gln Leu Val Ile Gln Thr
130 135 140

Ala Lys Asp Pro Asp Val Ser Leu Leu His Pro Gly Ala Leu Leu Glu
145 150 155 160

His Leu Lys Val Val His Ala Ala Thr Arg Val Thr Val His Met Tyr
165 170 175

Asp Ile Glu Trp Arg Leu Lys Asp Leu Cys Tyr Ser Pro Ser Ile Pro
180 185 190

Asp Phe Glu Gly Tyr His His Ile Glu Ser Ile Ile Asp Asn Val Ile
195 200 205

Pro Cys Ala Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly Ser Lys
210 215 220

Leu Leu Gly Pro Asp Tyr Pro Ile Tyr Val Pro His Leu Lys His Lys
225 230 235 240

Leu Gln Trp Thr His Leu Asn Pro Leu Glu Val Val Glu Glu Val Lys
245 250 255

Lys Leu Lys Phe Gln Phe Pro Leu Ser Thr Ile Glu Ala Tyr Met Lys
260 265 270

Arg Ala Gly Ile Thr Ser Ala Tyr Met Lys Lys Pro Cys Leu Asp Pro
275 280 285

Thr Asp Pro His Cys Pro Ala Thr Ala Pro Asn Lys Lys Ser Gly His
290 295 300

Ile Pro Asp Val Ala Ala Glu Leu Ser His Gly Cys Tyr Gly Phe Ala
305 310 315 320

Ala Ala Tyr Met His Trp Pro Glu Gln Leu Ile Val Gly Gly Ala Thr
325 330 335

Arg Asn Ser Thr Ser Ala Leu Arg Lys Ala Arg Xaa Leu Gln Thr Val
340 345 350

Val Gln Leu Met Gly Glu Arg Glu Met Tyr Glu Tyr Trp Ala Asp His
355 360 365

Tyr Lys Val His Gln Ile Gly Trp Asn Gln Glu Lys Ala Ala Ala Val
370 375 380

Leu Asp Ala Trp Gln Arg Lys Phe Ala Ala Glu Val Arg Lys Ile Thr
385 390 395 400

Thr Ser Gly Ser Val Ser Ser Ala Tyr Ser Phe Tyr Pro Phe Ser Thr
405 410 415

Ser Thr Leu Asn Asp Ile Leu Gly Lys Phe Ser Glu Val Ser Leu Lys
420 425 430

Asn Ile Ile Leu Gly Tyr Met Phe Met Leu Ile Tyr Val Ala Val Thr
435 440 445

Leu Ile Gln Trp Arg Asp Pro Ile Arg Ser Gln Ala Gly Val Gly Ile
450 455 460

Ala Gly Val Leu Leu Leu Ser Ile Thr Val Ala Ala Gly Leu Gly Phe
465 470 475 480

Cys Ala Leu Leu Gly Ile Pro Phe Asn Ala Ser Ser Thr Gln Ile Val
485 490 495

Pro Phe Leu Ala Leu Gly Leu Gly Val Gln Asp Met Phe Leu Leu Thr
500 505 510

His Thr Tyr Val Glu Gln Ala Gly Asp Val Pro Arg Glu Glu Arg Thr
515 520 525

Gly Leu Val Leu Lys Lys Ser Gly Leu Ser Val Leu Leu Ala Ser Leu
530 535 540

Cys Asn Val Met Ala Phe Leu Ala Ala Ala Leu Leu Pro Ile Pro Ala
545 550 555 560

Phe Arg Val Phe Cys Leu Gln Ala Ala Ile Leu Leu Leu Phe Asn Leu
565 570 575

Gly Ser Ile Leu Leu Val Phe Pro Ala Met Ile Ser Leu Asp Leu Arg
580 585 590

Arg Arg Ser Ala Ala Arg Ala Asp Leu Leu Cys Cys Leu Met Pro Glu
595 600 605

Ser Pro Leu Pro Lys Lys Lys Ile Pro Glu Arg Ala Lys Thr Arg Lys
610 615 620

Asn Asp Lys Thr His Arg Ile Asp Thr Thr Arg Gln Pro Leu Asp Pro
625 630 635 640

Asp Val Ser Glu Asn Val Thr Lys Thr Cys Cys Leu Ser Val Ser Leu
645 650 655

Thr Lys Trp Ala Lys Asn Gln Tyr Ala Pro Phe Ile Met Arg Pro Ala
660 665 670

Val Lys Val Thr Ser Met Leu Ala Leu Ile Ala Val Ile Leu Thr Ser
675 680 685

Val Trp Gly Ala Thr Lys Val Lys Asp Gly Leu Asp Leu Thr Asp Ile
690 695 700

Val Pro Glu Asn Thr Asp Glu His Glu Phe Leu Ser Arg Gln Glu Lys
705 710 715 720

Tyr Phe Gly Phe Tyr Asn Met Tyr Ala Val Thr Gln Gly Asn Phe Glu
725 730 735

Tyr Pro Thr Asn Gln Lys Leu Leu Tyr Glu Tyr His Asp Gln Phe Val
740 745 750

Arg Ile Pro Asn Ile Ile Lys Asn Asp Asn Gly Gly Leu Thr Lys Phe
755 760 765

Trp Leu Ser Leu Phe Arg Asp Trp Leu Leu Asp Leu Gln Val Ala Phe
770 775 780

Asp Lys Glu Val Ala Ser Gly Cys Ile Thr Gln Glu Tyr Trp Cys Lys
785 790 795 800

Asn Ala Ser Asp Glu Gly Ile Leu Ala Tyr Lys Leu Met Val Gln Thr
805 810 815

Gly His Val Asp Asn Pro Ile Asp Lys Ser Leu Ile Thr Ala Gly His
820 825 830

Arg Leu Val Asp Lys Asp Gly Ile Ile Asn Pro Lys Ala Phe Tyr Asn
835 840 845

Tyr Leu Ser Ala Trp Ala Thr Asn Asp Ala Leu Ala Tyr Gly Ala Ser
850 855 860

Gln Gly Asn Leu Lys Pro Gln Pro Gln Arg Trp Ile His Ser Pro Glu
865 870 875 880

Asp Val His Leu Glu Ile Lys Lys Ser Ser Pro Leu Ile Tyr Thr Gln
885 890 895

Leu Pro Phe Tyr Leu Ser Gly Leu Ser Asp Thr Xaa Ser Ile Lys Thr
900 905 910

Leu Ile Arg Ser Val Arg Asp Leu Cys Leu Lys Tyr Glu Ala Lys Gly
915 920 925

Leu Pro Asn Phe Pro Ser Gly Ile Pro Phe Leu Phe Trp Glu Gln Tyr
930 935 940

Leu Tyr Leu Arg Thr Ser Leu Leu Leu Ala Leu Ala Cys Ala Leu Ala
945 950 955 960

Ala Val Phe Ile Ala Val Met Val Leu Leu Leu Asn Ala Trp Ala Ala
965 970 975

Val Leu Val Thr Leu Ala Leu Ala Thr Leu Val Leu Gln Leu Leu Gly
980 985 990

Val Met Ala Leu Leu Gly Val Lys Leu Ser Ala Met Pro Ala Val Leu
995 1000 1005

Leu Val Leu Ala Ile Gly Arg Gly Val His Phe Thr Val His Leu Cys
1010 1015 1020

Leu Gly Phe Val Thr Ser Ile Gly Cys Lys Arg Arg Arg Ala Ser Leu
1025 1030 1035 1040

Ala Leu Glu Ser Val Leu Ala Pro Val Val His Gly Ala Leu Ala Ala
1045 1050 1055

Ala Leu Ala Ala Ser Met Leu Ala Ala Ser Glu Cys Gly Phe Val Ala
1060 1065 1070

Arg Leu Phe Leu Arg Leu Leu Leu Asp Ile Val Phe Leu Gly Leu Ile
1075 1080 1085

Asp Gly Leu Leu Phe Phe Pro Ile Val Leu Ser Ile Leu Gly Pro Ala
1090 1095 1100

Ala Glu Val Arg Pro Ile Glu His Pro Glu Arg Leu Ser Thr Pro Ser
1105 1110 1115 1120

Pro Lys Cys Ser Pro Ile His Pro Arg Lys Ser Ser Ser Ser Ser Gly
1125 1130 1135

Gly Gly Asp Lys Ser Ser Arg Thr Ser Lys Ser Ala Pro Arg Pro Cys
1140 1145 1150

Ala Pro Ser Leu Thr Thr Ile Thr Glu Glu Pro Ser Ser Trp His Ser
1155 1160 1165

Ser Ala His Ser Val Gln Ser Ser Met Gln Ser Ile Val Val Gln Pro
1170 1175 1180

Glu Val Val Val Glu Thr Thr Thr Tyr Asn Gly Ser Asp Ser Ala Ser
1185 1190 1195 1200

Gly Arg Ser Thr Pro Thr Lys Ser Ser His Gly Gly Ala Ile Thr Thr
1205 1210 1215

Thr Lys Val Thr Ala Thr Ala Asn Ile Lys Val Glu Val Val Thr Pro
1220 1225 1230

Ser Asp Arg Lys Ser Arg Arg Ser Tyr His Tyr Tyr Asp Arg Arg Arg
1235 1240 1245

Asp Arg Asp Glu Asp Arg Asp Arg Asp Arg Glu Arg Asp Arg Asp Arg
1250 1255 1260

Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg
1265 1270 1275 1280

Glu Arg Ser Arg Glu Arg Asp Arg Arg Asp Arg Tyr Arg Asp Glu Arg
1285 1290 1295

Asp His Arg Ala Ser Pro Arg Glu Lys Arg Gln Arg Phe Trp Thr
1300 1305 1310

(2) INFORMATION FOR SEQ ID NO:5:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 4434 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:



CGAAACAAGA GAGCGAGTGA GAGTAGGGAG AGCGTCTGTG TTGTGTGTTG AGTGTCGCCC 60

ACGCACACAG GCGCAAAACA GTGCACACAG ACGCCCGCTG GGCAAGAGAG AGTGAGAGAG 120

AGAAACAGCG GCGCGCGCTC GCCTAATGAA GTTGTTGGCC TGGCTGGCGT GCCGCATCCA 180

CGAGATACAG ATACATCTCT CATGGACCGC GACAGCCTCC CACGCGTTCC GGACACACAC 240

GGCGATGTGG TCGATGAGAA ATTATTCTCG GATCTTTACA TACGCACCAG CTGGGTGGAC 300

[illegible] CGCTCGATCA GATAGATAAG [illegible] GTGGCAGCCG CACGGCGATC 360

TATCTGCGAT CAGTATTCCA GTCCCACCTC GAAACCCTCG GCAGCTCCGT GCAAAAGCAC 420

[illegible] TGCTATTCGT GGCTATCCTG GTGCTGAGCA CCTTCTGCGT CGGCCTGAAG 480

[illegible] [illegible] GGTGCACCAG CTGTGGATCC AGGAGGGCGG CCGGCTGGAG 540

[illegible] [illegible] GAAGACGATC GGCGAGGACG AGTCGGCCAC GCATCAGCTG 600

[illegible] [illegible] CCCGAACGCC TCCGTCCTGC ATCCGCAGGC GCTGCTTGCC 660

[illegible] [illegible] [illegible] GTCAAGGTGC ACCTCTACGA CACCGAATGG 720

GGGCTGCGCG ACATGTGCAA CATGCCGAGC ACGCCCTCCT TCGAGGGCAT CTACTACATC 780

GAGCAGATCC TGCGCCACCT CATTCCGTGC TCGATCATCA CGCCGCTGGA CTGTTTCTGG 840

GAGGGAAGCC AGCTGTTGGG TCCGGAATCA GCGGTCGTTA TACCAGGCCT CAACCAACGA 900

CTCCTGTGGA CCACCCTGAA TCCCGCCTCT GTGATGCAGT ATATGAAACA AAAGATGTCC 960

GAGGAAAAGA TCAGCTTCGA CTTCGAGACC GTGGAGCAGT ACATGAAGCG TGCGGCCATT 1020

GGCAGTGGCT ACATGGAGAA GCCCTGCCTG AACCCACTGA ATCCCAATTG CCCGGACACG 1080

GCACCGAACA AGAACAGCAC CCAGCCGCCG GATGTGGGAG CCATCCTGTC CGGAGGCTGC 1140

TACGGTTATG CCGCGAAGCA CATGCACTGG CCGGAGGAGC TGATTGTGGG CGGACGGAAG 1200

AGGAACCGCA GCGGACACTT GAGGAAGGCC CAGGCCCTGC AGTCGGTGGT GCAGCTGATG 1260

ACCGAGAAGG AAATGTACGA CCAGTGGCAG GACAACTACA AGGTGCACCA TCTTGGATGG 1320

ACGCAGGAGA AGGCAGCGGA GGTTTTGAAC GCCTGGCAGC GCAACTTTTC GCGGGAGGTG 1380

GAACAGCTGC TACGTAAACA GTCGAGAATT GCCACCAACT ACGATATCTA CGTGTTCAGC 1440

TCGGCTGCAC TGGATGACAT CCTGGCCAAG TTCTCCCATC CCAGCGCCTT GTCCATTGTC 1500

ATCGGCGTGG CCGTCACCGT TTTGTATGCC TTTTGCACGC TCCTCCGCTG GAGGGACCCC 1560

GTCCGTGGCC AGAGCAGTGT GGGCGTGGCC GGAGTTCTGC TCATGTGCTT CAGTACCGCC 1620

GCCGGATTGG GATTGTCAGC CCTGCTCGGT ATCGTTTTCA ATGCGCTGAC CGCTGCCTAT 1680

GCGGAGAGCA ATCGGCGGGA GCAGACCAAG CTGATTCTCA AGAACGCCAG CACCCAGGTG 1740

GTTCCGTTTT TGGCCCTTGG TCTGGGCGTC GATCACATCT TCATAGTGGG ACCGAGCATC 1800

CTGTTCAGTG CCTGCAGCAC CGCAGGATCC TTCTTTGCGG CCGCCTTTAT TCCGGTGCCG 1860

GCTTTGAAGG TATTCTGTCT GCAGGCTGCC ATCGTAATGT GCTCCAATTT GGCAGCGGCT 1920

CTATTGGTTT TTCCGGCCAT GATTTCGTTG GATCTACGGA GACGTACCGC CGGCAGGGCG 1980

GACATCTTCT GCTGCTGTTT TCCGGTGTGG AAGGAACAGC CGAAGGTGGC ACCTCCGGTG 2040

CTGCCGCTGA ACAACAACAA CGGGCGCGGG GCCCGGCATC CGAAGAGCTG CAACAACAAC 2100

AGGGTGCCGC TGCCCGCCCA GAATCCTCTG CTGGAACAGA GGGCAGACAT CCCTGGGAGC 2160

AGTCACTCAC TGGCGTCCTT CTCCCTGGCA ACCTTCGCCT TTCAGCACTA CACTCCCTTC 2220

CTCATGCGCA GCTGGGTGAA GTTCCTGACC GTTATGGGTT TCCTGGCGGC CCTCATATCC 2280

AGCTTGTATG CCTCCACGCG CCTTCAGGRT GGCCTGGACA TTATTGATCT GGTGCCCAAG 2340

GACAGCAACG AGCACAAGTT CCTGGATGCT CAAACTCGGC TCTTTGGCTT CTACAGCATG 2400

TATGCGGTTA CCCAGGGCAA CTTTGAATAT CCCACCCAGC AGCAGTTGCT CAGGGACTAC 2460

CATGATTCCT TTGTGCGGGT GCCACATGTG ATCAAGAATG ATAACGGTGG ACTGCCGGAC 2520

TTCTGGCTGC TGCTCTTCAG CGAGTGGCTG GGTAATCTGC AAAAGATATT CGACGAGGAA 2580

TACCGCGACG GACGGCTGAC CAAGGAGTGC TGGTTCCCAA ACGCCAGCAG CGATGCCATC 2640

CTGGCCTACA AGCTAATCGT GCAAACCGGC CATGTGGACA ACCCCGTGGA CAAGGAACTG 2700

GTGCTCACCA ATCGCCTGGT CAACAGCGAT GGCATCATCA ACCAACGCGC CTTCTACAAC 2760

TATCTGTCGG CATGGGCCAC CAACGACGTC TTCGCCTACG GAGCTTCTCA GGGCAAATTG 2820

TATCCGGAAC CGCGCCAGTA TTTTCACCAA CCCAACGAGT ACGATCTTAA GATACCCAAG 2880

AGTCTGCCAT TGGTCTACGC TCAGATGCCC TTTTACCTCC ACGGACTAAC AGATACCTCG 2940

CAGATCAAGA CCCTGATAGG TCATATTCGC GACCTGAGCG TCAAGTACGA GGGCTTCGGC 3000

CTGCCCAACT ATCCATCGGG CATTCCCTTC ATCTTCTGGG AGCAGTACAT GACCCTGCGC 3060

TCCTCACTGG CCATGATCCT GGCCTGCGTG CTACTCGCCG CCCTGGTGCT GGTCTCCCTG 3120

CTCCTGCTCT CCGTTTGGGC CGCCGTTCTC GTGATCCTCA GCGTTCTGGC CTCGCTGGCC 3180

CAGATCTTTG GGGCCATGAC TCTGCTGGGC ATCAAACTCT CGGCCATTCC GGCAGTCATA 3240

CTCATCCTCA GCGTGGGCAT GATGCTGTGC TTCAATGTGC TGATATCACT GGGCTTCATG 3300

ACATCCGTTG GCAACCGACA GCGCCGCGTC CAGCTGAGCA TGCAGATGTC CCTGGGACCA 3360

CTTGTCCACG GCATGCTGAC CTCCGGAGTG GCCGTGTTCA TGCTCTCCAC GTCGCCCTTT 3420

GAGTTTGTGA TCCGGCACTT CTGCTGGCTT CTGCTGGTGG TCTTATGCGT TGGCGCCTGC 3480

AACAGCCTTT TGGTGTTCCC CATCCTACTG AGCATGGTGG GACCGGAGGC GGAGCTGGTG 3540

CCGCTGGAGC ATCCAGACCG CATATCCACG CCCTCTCCGC TGCCCGTGCG CAGCAGCAAG 3600

AGATCGGGCA AATCCTATGT GGTGCAGGGA TCGCGATCCT CGCGAGGCAG CTGCCAGAAG 3660

TCGCATCACC ACCACCACAA AGACCTTAAT GATCCATCGC TGACGACGAT CACCGAGGAG 3720

CCGCAGTCGT GGAAGTCCAG CAACTCGTCC ATCCAGATGC CCAATGATTG GACCTACCAG 3780

CCGCGGGAAC AGCGACCCGC CTCCTACGCG GCCCCGCCCC CCGCCTATCA CAAGGCCGCC 3840

GCCCAGCAGC ACCACCAGCA TCAGGGCCCG CCCACAACGC CCCCGCCTCC CTTCCCGACG 3900

GCCTATCCGC CGGAGCTGCA GAGCATCGTG GTGCAGCCGG AGGTGACGGT GGAGACGACG 3960

CACTCGGACA GCAACACCAC CAAGGTGACG GCCACGGCCA ACATCAAGGT GGAGCTGGCC 4020

ATGCCCGGCA GGGCGGTGCG CAGCTATAAC TTTACGAGTT AGCACTAGCA CTAGTTCCTG 4080

TAGCTATTAG GACGTATCTT TAGACTCTAG CCTAAGCCGT AACCCTATTT GTATCTGTAA 4140

AATCGATTTG TCCAGCGGGT CTGCTGAGGA TTTCGTTCTC ATGGATTCTC ATGGATTCTC 4200

ATGGATGCTT AAATGGCATG GTAATTGGCA AAATATCAAT TTTTGTGTCT CAAAAAGATG 4260

CATTAGCTTA TGGTTTCAAG ATACATTTTT AAAGAGTCCG CCAGATATTT ATATAAAAAA 4320

AATCCAAAAT CGACGTATCC ATGAAAATTG AAAAGCTAAG CAGACCCGTA TGTATGTATA 4380

TGTGTATGCA TGTTAGTTAA TTTCCCGAAG TCCGGTATTT ATAGCAGCTG CCTT 4434
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1285 amino acids 

(B) TYPE; amino acid 

CO STRANDEDNESS : single 
(D) TOPOLOGy: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Asp Arg Asp Ser Leu Pro Arg Val Pro Asp Thr His Gly Asp Val 

15 10 15 

Val Asp Glu Lys Leu Phe Ser Asp Leu Tyr lie Arg Thr Ser Trp Val 

20 25 30 

Asp Ala Gin Val Ala Leu Asp Gin lie Asp Lys Gly Lys Ala Arg Gly 

35 40 45 



Ser Arg Thr Ala lie Tyr Leu Arg Ser Val Phe Gin Ser His Leu Glu 
50 55 60 
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Thr Leu Gly Ser Ser Val Gin Lys His Ala Gly Lys Val Leu Phe Val 
65 70 75 80 

Aia lie Leu Val Leu Ser Thr Phe Cys Val Gly Leu Lys Ser Ala Gin 
85 90 95 

He His Ser Lys Val His Gin Leu Trp He Gin Glu Gly Gly Arg Leu 
100 105 110 

Glu Ala Glu Leu Ala Tyr Thr Gin Lys Thr He Gly Glu Asp Glu Ser 
115 120 125 

Ala Thr His Gin Leu Leu He Gin Thr Thr His Asp Pro Asn Ala Ser 
130 135 140 

Val Leu His Pro Gin Ala Leu Leu Ala His Leu Glu Val Leu Val Lys 
145 ISO 155 160 

Ala Thr Ala Val Lys Val His Leu Tyr Asp Thr Glu Trp Gly Leu Arg 
165 170 175 

Asp Met Cys Asn Met Pro Ser Thr Pro Ser . Phe Glu Gly He Tyr Tyr 
180 185 190 

He Glu Gin He Leu Arg His Leu He Pro Cys Ser He He Thr Pro 
195 200 205 

Leu Asp Cys Phe Trp Glu Gly Ser Gin Leu Leu Gly Pro Glu Ser Ala 
210 215 220 

Val Val He Pro Gly Leu Asn Gin Arg Leu Leu Trp Thr Thr Leu Asn 
225 230 235 240 

Pro Ala Ser Val Met Gin Tyr Met Lys Gin Lys Met Ser Glu Glu Lys 
245 250 255 

He Ser Phe Asp Phe Glu Thr Val Glu Gin Tyr Met Lys Arg Ala Ala 
260 265 270 

He Gly Ser Gly Tyr Met Glu Lys Pro Cys Leu Asn Pro Leu Asn Pro 
275 280 285 

Asn Cys Pro Asp Thr Ala Pro Asn Lys Asn Ser Thr Gin Pro Pro Asp 
290 295 300 

Val Gly Ala He Leu Ser Qly Gly Cys Tyr Gly Tyr Ala Ala Lys His 
305 310 315 320 

Met His Trp Pro Glu Glu Leu He Val Gly Gly Arg Lys Arg Asn Arg 
325 330 335 

Ser Gly His Leu Arg Lys Ala Gin Ala Leu Gin Ser Val Val Gin Leu 
340 345 350 

Met Thr Glu Lys Glu Met Tyr Asp Gin Trp Gin Asp Asn Tyr Lys Val 
355 360 365 

His His Leu Gly Trp Thr Gin Glu Lys Aia Aia Glu Val Leu Asn Ala 
370 375 380 



Trp Gin Arg Asn Phe Ser Arg Glu Val Glu Gin Leu Leu Arg Lys Gin 
385 390 395 400 
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Ser Arg He Ala Thr Asn Tyr Asp He Tyr Val Phe Ser Ser Ala Ala 
405 410 

Leu Asp Asp lie Leu Ala Lys Phe Ser His Pro Ser Aia Leu Ser lie 
420 425 430 

Val He Gly Val Ala Val Thr Val Leu Tyr Ala Phe Cys Thr Leu Leu 
435 440 445 

Arg Trp Arg Asp Pro Val Arg Gly Gin Ser Ser Val Gly Val Ala Gly 

455 460 

Val Leu Leu Met Cys Phe Ser Thr Ala Ala Gly Leu Gly Leu Ser Ala 
4€5 470 475 480 

Leu Leu Gly He Val Phe Asn Ala Leu Thr Ala Ala Tyr Ala Glu Ser 
485 490 495 



/i.y o-iu uxn jnr Lys Leu He Leu Lys Asn Aia Ser Thr Gin 

505 SIO 

Val Val Pro Phe Leu Ala Leu Gly Leu Gly Val Asp His He Phe He 
515 520 525 

Val Gly Pro Ser He Leu Phe Ser Ala Cys Ser Thr Ala Gly Ser Phe 
530 535 540 

Phe Ala Ala Ala Phe He Pro Val Pro Ala Leu Lys Val Phe Cys Leu 
^45 550 555 560 

Gin Ala Ala He Val Met Cys Ser Asn Leu Ala Ala Ala Leu Leu Val 
565 570 575 

Phe Pro Ala Met He Ser Leu Asp Leu Arg Arg Arg Thr Ala Gly Arq 
580 585 590 

Ala Asp He Phe Cys Cys Cys Phe Pro Val Trp Lys Glu Gin Pro Lys 
595 600 605 

Val Ala Pro Pro Val Leu Pro Leu Asn Asn Asn Asn Gly Arg Gly Ala 
610 615 620 

Arg His Pro Lys Ser Cys Asn Asn Asn Arg Val Pro Leu Pro Ala Gin 

630 635 640 

Asn Pro Leu Leu Glu Gin Arg Ala Asp He Pro Gly Ser Ser His Ser 
645 650 655 

Leu Ala Ser Phe Ser Leu Ala Thr Phe Ala Phe Gin His Tyr Thr Pro 
660 665 670 

Phe Leu Met Arg Ser Trp Val Lys Phe Leu Thr Val Met Gly Phe Leu 

^75 680 685 

Ala Ala Leu He Ser Ser Leu Tyr Ala Ser Thr Arg Leu Gin Asp Gly 
690 695 700 

Leu Asp He He Asp Leu Val Pro Lys Asp Ser Asn Glu His Lys Phe 

710 715 720 

Leu Asp Ala Gin Thr Arg Leu Phe Gly Phe Tyr Ser Met Tyr Ala Val 
''^S 730 735 

Thr Gin Gly Asn Phe Glu Tyr Pro Thr Gin Gin Gin Leu Leu Arg Asp 
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740 745 750 

Tyr His Asp Ser Phe Arg Val Pro His Val Ile Lys Asn Asp Asn Gly
755 760 765

Gly Leu Pro Asp Phe Trp Leu Leu Leu Phe Ser Glu Trp Leu Gly Asn
770 775 780

Leu Gln Lys Ile Phe Asp Glu Glu Tyr Arg Asp Gly Arg Leu Thr Lys
785 790 795 800

Glu Cys Trp Phe Pro Asn Ala Ser Ser Asp Ala Ile Leu Ala Tyr Lys
805 810 815

Leu Ile Val Gln Thr Gly His Val Asp Asn Pro Val Asp Lys Glu Leu
820 825 830

Val Leu Thr Asn Arg Leu Val Asn Ser Asp Gly Ile Ile Asn Gln Arg
835 840 845

Ala Phe Tyr Asn Tyr Leu Ser Ala Trp Ala Thr Asn Asp Val Phe Ala
850 855 860

Tyr Gly Ala Ser Gln Gly Lys Leu Tyr Pro Glu Pro Arg Gln Tyr Phe
865 870 875 880

His Gln Pro Asn Glu Tyr Asp Leu Lys Ile Pro Lys Ser Leu Pro Leu
885 890 895

Val Tyr Ala Gln Met Pro Phe Tyr Leu His Gly Leu Thr Asp Thr Ser
900 905 910

Gln Ile Lys Thr Leu Ile Gly His Ile Arg Asp Leu Ser Val Lys Tyr
915 920 925

Glu Gly Phe Gly Leu Pro Asn Tyr Pro Ser Gly Ile Pro Phe Ile Phe
930 935 940

Trp Glu Gln Tyr Met Thr Leu Arg Ser Ser Leu Ala Met Ile Leu Ala
950 955 960

Cys Val Leu Leu Ala Ala Leu Val Leu Val Ser Leu Leu Leu Leu Ser
965 970 975

Val Trp Ala Ala Val Leu Val Ile Leu Ser Val Leu Ala Ser Leu Ala
980 985 990

Gln Ile Phe Gly Ala Met Thr Leu Leu Gly Ile Lys Leu Ser Ala Ile
995 1000 1005

Pro Ala Val Ile Leu Ile Leu Ser Val Gly Met Met Leu Cys Phe Asn
1010 1015 1020

Val Leu Ile Ser Leu Gly Phe Met Thr Ser Val Gly Asn Arg Gln Arg
1025 1030 1035 1040

Arg Val Gln Leu Ser Met Gln Met Ser Leu Gly Pro Leu Val His Gly
1045 1050 1055

Met Leu Thr Ser Gly Val Ala Val Phe Met Leu Ser Thr Ser Pro Phe
1060 1065 1070

Glu Phe Val Ile Arg His Phe Cys Trp Leu Leu Leu Val Val Leu Cys
1075 1080 1085

Val Gly Ala Cys Asn Ser Leu Leu Val Phe Pro Ile Leu Leu Ser Met
1090 1095 1100

Val Gly Pro Glu Ala Glu Leu Val Pro Leu Glu His Pro Asp Arg Ile
1105 1110 1115 1120

Ser Thr Pro Ser Pro Leu Pro Val Arg Ser Ser Lys Arg Ser Gly Lys
1125 1130 1135

Ser Tyr Val Val Gln Gly Ser Arg Ser Ser Arg Gly Ser Cys Gln Lys
1140 1145 1150

Ser His His His His His Lys Asp Leu Asn Asp Pro Ser Leu Thr Thr
1155 1160 1165

Ile Thr Glu Glu Pro Gln Ser Trp Lys Ser Ser Asn Ser Ser Ile Gln
1170 1175 1180

Met Pro Asn Asp Trp Thr Tyr Gln Pro Ara Glu Gln Arg Pro Ala Ser
1185 1190 1195 1200

Tyr Ala Ala Pro Pro Pro Ala Tyr His Lys Ala Ala Ala Gln Gln His
1205 1210 1215

His Gln His Gln Gly Pro Pro Thr Thr Pro Pro Pro Pro Phe Pro Thr
1220 1225 1230

Ala Tyr Pro Pro Glu Leu Gln Ser Ile Val Val Gln Pro Glu Val Thr
1235 1240 1245

Val Glu Thr Thr His Ser Asp Ser Asn Thr Thr Lys Val Thr Ala Thr
1250 1255 1260

Ala Asn Ile Lys Val Glu Leu Ala Met Pro Gly Arg Ala Val Arg Ser
1265 1270 1275 1280

Tyr Asn Phe Thr Ser
1285

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 345 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

AAvJGTCCATC AGCTTTGGAT ACAGGAAGGT GGTTCGCTCG AGCATGAGCT AGCCTACACG 60
CAGAAATCGC TCGGCGAGAT GGACTCCTCC ACGCACCAGC TGCTAATCCA AACNCCCAAA 120
GATATGGACG CCTCGATACT GCACCCGAAC GCGCTACTGA CGCACCTGGA CGTGGTGAAG 180
AAAGCGATCT CGGTGACGGT GCACATGTAC GACATCACGT GGAGNCTCAA GGACATGTGC 240
TACTCGCCCA GCATACCGAG NTTCGATACG CACTTTATCG AGCAGATCTT CGAGAACATC 300
ATACCGTGCG CGATCATCAC GCCGCTGGAT TGCTTTTGGG AGGGA 345

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 115 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Lys Val His Gln Leu Trp Ile Gln Glu Gly Gly Ser Leu Glu His Glu
1 5 10 15

Leu Ala Tyr Thr Gln Lys Ser Leu Gly Glu Met Asp Ser Ser Thr His
20 25 30

Gln Leu Leu Ile Gln Thr Pro Lys Asp Met Asp Ala Ser Ile Leu His
35 40 45

Pro Asn Ala Leu Leu Thr His Leu Asp Val Val Lys Lys Ala Ile Ser
50 55 60

Val Thr Val His Met Tyr Asp Ile Thr Trp Xaa Leu Lys Asp Met Cys
65 70 75 80

Tyr Ser Pro Ser Ile Pro Xaa Phe Asp Thr His Phe Ile Glu Gln Ile
85 90 95

Phe Glu Asn Ile Ile Pro Cys Ala Ile Ile Thr Pro Leu Asp Cys Phe
100 105 110

Trp Glu Gly
115

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5187 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CWTCTGTCA CCCGGAGCCG GAGTCCCCGG CGGCCAGCAG CGTCCTCGCG AGCCGAGCGC 60
CCAGGCGCGC CCGGAGCCCG CGGCGGCGGC GGCAACATGG CCTCGGCTGG TAACGCCGCC 120
GGGGCCCTGG GCAGGCAGGC CGGCGGCGGG AGGCGCAGAC GGACCGGGGG ACCGCACCGC 180
GCCGCGCCGG ACCGGGACTA TCTGCACCGG CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240
GAGCAGATTT CCAAGGGGAA GGCTACTGGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300
TTTCAGAGAC TCTTATTTAA ACTGGGTTGT TACATTCAAA AGAACTGCGG CAAGTTTTTG 360
GTTGTGGGTC TCCTCATATT TGGGGCCTTC GCTGTGGGAT TAAAGGCAGC TAATCTCGAG 420
ACCAACGTGG AGGAGCTGTG GGTGGAAGTT GGTGGACGAG TGAGTCGAGA ATTAAATTAT 480
ACCCGTCAGA AGATAGGAGA AGAGGCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540
AAAGAAGAAG GCGCTAATGT TCTGACCACA GAGGCTCTCC TGCAACACCT GGACTCAGCA 600
CTCCAGGCCA GTCGTGTGCA CGTCTACATG TATAACAGGC AATGGAAGTT GGAACATTTG 660
TGCTACAAAT CAGGGGAACT TATCACGGAG ACAGGTTACA TGGATCAGAT AATAGAATAC 720
CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780
T.::rGGGACAG CATACCTCCT AGGTAAGCCT CCTTTACGGT GGACAAACTT TGACCCCTTG 840
GAATTCCTAG AAGAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTGGGA GGAAATGCTG 900
AATAAAGCCG AAGTTGGCCA TGGGTACATG GACCGGCCTT GCCTCAACCC AGCCGACCCA 960
GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTGATGT GGCCCTTGTT 1020
TTGAATGGTG GATGTCAAGG TTTATCCAGG AAGTATATGC ATTGGCAGGA GGAGTTGATT 1080
GTGGGTGGTA CCGTCAAGAA TGCCACTGGA AAACTTGTCA GCGCTCACGC CCTGCAAACC 1140
ATGTTCCAGT TAATGACTCC CAAGCAAATG TATGAACACT TCAGGGGCTA CGACTATGTC 1200
TCTCACATCA ACTGGAATGA AGACAGGGCA GCCGCCATCC TGGAGGCCTG GCAGAGGACT 1260
TACGTGGAGG TGGTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320
ACAACCACGA CCCTGGACGA CATCCTAAAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380
GCCAGCGGCT ACCTACTGAT GCTTGCCTAT GCCTGTTTAA CCATGCTGCG CTGGGACTGC 1440
TCCAAGTCCC AGGGTGCCGT GGGGCTGGCT GGCGTCCTGT TGGTTGCGCT GTCAGTGGCT 1500
GCAGGATTGG GCCTCTGCTC CTTGATTGGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560
TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG GATGATGTCT TCCTCCTGGC CCATGCATTC 1620
AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680
CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740
GCATTGATCC CTATCCCTGC CCTGCGAGCG TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800
TTCAATTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860
CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920
ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980

CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCACTATGCA GTCCACCGTT 2040
CAGCTCCGCA CAGAGTATGA CCCTCACACG CACGTGTACT ACACCACCGC CGAGCCACGC 2100
TCTGAGATCT CTGTACAGCC TGTTACCGTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160
GAGAGCACCA GCTCTACCAG GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220
CTCGAGCCCC CCTGCACCAA GTGGACACTC TCTTCGTTTG CAGAGAAGCA CTATGCTCCT 2280
TTCCTCCTGA AACCCAAAGC CAAGGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340
GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACGGA CATTGTTCCC 2400
CG^GAAACCA GAGAATATGA CTTCATAGCT GCCCAGTTCA AGTACTTCTC TTTCTACAAC 2460
ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520
CATAAGAGTT TCAGCAATGT GAAGTATGTC ATGCTGGAGG AGAACAAGCA ACTTCCCCAA 2580
ATGTGGCTGC ACTACTTTAG AGACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640
TGGGAAACTG GGAGGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGGGTCCTC 2700
GCTTACAAAC TCCTGGTGCA GACTGGCAGC CGAGACAAGC CCATCGACAT TAGTCAGTTG 2760
ACTAAACAGC GTCTGGTAGA CGCAGATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820
CTGACCGCTT GGGTCAGCAA CGACCCTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880
CCTCACCGGC CGGAGTGGGT CCATGACAAA GCCGACTACA TGCCAGAGAC CAGGCTGAGA 2940
ATCCCAGCAG CAGAGCCCAT CGAGTACGCT CAGTTCCCTT TCTACCTCAA CGGCCTACGA 3000
GACACCTCAG ACTTTGTGGA AGCCATAGAA AAAGTGAGAG TCATCTGTAA CAACTATACG 3060
AGCCTGGGAC TGTCCAGCTA CCCCAATGGC TACCCCTTCC TGTTCTGGGA GCAATACATC 3120
ACjC^w i GCGCC ACTGGCTGCT GCTATCCATC AGCGTGGTGC TGGCCTGCAC GTTTCTAGTG 3180
TGCGCAGTCT TCCTCCTGAA CCCCTGGACG GCCGGGATCA TTGTCATGGT CCTGGCTCTG 3240
ATGACCGTTG AGCTCTTTGG CATGATGGGC CTCATTGGGA TCAAGCTGAG TGCTGTGCCT 3300
u rCjtjTCATCC TGATTGCATC TGTTGGCATC GGAGTGGAGT TCACCGTCCA CGTGGCTTTG 3360
VjCv- a i TLTCjA CAGCCATTGG GGACAAGAAC CACAGGGCTA TGCTCGCTCT GGAACACATG 3420
TT i (jCTCCCG TTCTGGACGG TGCTGTGTCC ACTCTGCTGG GTGTACTGAT GCTTGCAGGG 3480
           ATTTCATTGT CAGATACTTC TTTGCCGTCC TGGCCATTCT CACCGTCTTG 3540
           Ai CjCjAUTGGT TCTGCTGCCT GTCCTCTTAT CCTTCTTTGG ACCGTGTCCT 3600
GAGGTGTCTC CAGCCAATGG CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCGCCTCCA 3660
AGTGTCGTCC GGTTTGCCGT GCCTCCTGGT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720
TCGGAGTACA GCTCTCAGAC CACGGTGTCT GGCATCAGTG AGGAGCTCAG GCAATACGAA 3780
GCACAGCAGG GTGCCGGAGG CCCTGCCCAC CAAGTGATTG TGGAAGCCAC AGAAAACCCT 3840
GTCTTTGCCC GGTCCACTGT GGTCCATCCG GACTCCAGAC ATCAGCCTCC CTTGACCCCT 3900
CGGCAACAGC CCCACCTGGA CTCTGGCTCC TTGTCCCCTG GACGGCAAGG CCAGCAGCCT 3960
CGAAGGGATC CCCCTAGAGA AGGCTTGCGG CCACCCCCCT ACAGACCGCG CAGAGACGCT 4020
TTTGAAATTT CTACTGAAGG GCATTCTGGC CCTAGCAATA GGGACCGCTC AGGGCCCCGT 4080

GGGGCCCGTT CTCACAACCC TCGGAACCCA ACGTCCACCG CCATGGGCAG CTCTGTGCCC 4140
AGCTACTGCC AGCCCATCAC CACTGTGACG GCTTCTGCTT CGGTGACTGT TGCTGTGCAT 4200
CCCCCGCCTG GACCTGGGCG CAACCCCCGA GGGGGGCCCT GTCCAGGCTA TGAGAGCTAC 4260
CCTGAGACTG ATCACGGGGT ATTTGAGGAT CCTCATGTGC CTTTTCATGT CAGGTGTGAG 4320
AGGAGGGACT CAAAGGTGGA GGTCATAGAG CTACAGGACG TGGAATGTGA GGAGAGGCCG 4380
TGGGGGAGCA GCTCCAACTG AGGGTAATTA AAATCTGAAG CAAAGAGGCC AAAGATTGGA 4440
AAGCCCCGCC CCCACCTCTT TCCAGAACTG CTTGAAGAGA ACTGCTTGGA ATTATGGGAA 4500
GGCAGTTCAT TGTTACTGTA ACTGATTGTA TTATTKKGTG AAATATTTCT ATAAATATTT 4560
AARAGGTGTA CACATGTAAT ATACATGGAA ATGCTGTACA GTCTATTTCC TGGGGCCTCT 4620
CCACTCCTGC CCCAGAGTGG GGAGACCACA GGGGCCCTTT CCCCTGTGTA CATTGGTCTC 4680
TGTGCCACAA CCAAGCTTAA CTTAGTTTTA AAAAAAATCT CCCAGCATAT GTCGCTGCTG 4740
CTTAAATATT GTATAATTTA CTTGTATAAT TCTATGCAAA TATTGCTTAT GTAATAGGAT 4800
TATTTGTAAA GGTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860
ATGAATTGTT ACTGTTAACT TTTGAACACG CTATGCGTGG TAATTGTTTA ACGAGCAGAC 4920
ATGAAGAAAA CAGGTTAATC CCAGTGGCTT CTCTAGGGGT AGTTGTATAT GGTTCGCATG 4980
GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA TTGTGGTTTG TTGTTGTTGT 5040
TGCTGTTGTT GTTCATTTTG GTGTTTTTGG TTGCTTTGTA TGATCTTAGC TCTGGCCTAG 5100
GTGGGCTGGG AAGGTCCAGG TCTTTTTCTG TCGTGATGCT GGTGGAAJ\GG TGACCCCAAT 5160
CATCTGTCCT ATTCTCTGGG ACTATTC 5187

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1434 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Ala Ser Ala Gly Asn Ala Ala Gly Ala Leu Gly Arg Gln Ala Gly
1 5 10 15

Gly Gly Arg Arg Arg Arg Thr Gly Gly Pro His Arg Ala Ala Pro Asp
20 25 30

Arg Asp Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu
35 40 45

Glu Gln Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp
50 55 60

Leu Arg Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile
65 70 75 80

Gln Lys Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly
85 90 95

Ala Phe Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu
100 105 110

Glu Leu Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr
115 120 125

Thr Arg Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met
130 135 140

Ile Gln Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala
145 150 155 160

Leu Leu Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val
165 170 175

Tyr Met Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser
180 185 190

Gly Glu Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr
195 200 205

Leu Tyr Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
210 215 220

Ala Lys Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu
225 230 235 240

Arg Trp Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys
245 250 255

Ile Asn Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu
260 265 270

Val Gly His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro
275 280 285

Asp Cys Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp
290 295 300

Val Ala Leu Val Leu Asn Gly Gly Cys Gln Gly Leu Ser Arg Lys Tyr
305 310 315 320

Met His Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ala
325 330 335

Thr Gly Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu
340 345 350

Met Thr Pro Lys Gln Met Tyr Glu His Phe Arg Gly Tyr Asp Tyr Val
355 360 365

Ser His Ile Asn Trp Asn Glu Asp Arg Ala Ala Ala Ile Leu Glu Ala
370 375 380

Trp Gln Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Pro Asn
385 390 395 400

Ser Thr Gln Lys Val Leu Pro Phe Thr Thr Thr Thr Leu Asp Asp Ile
405 410 415

Leu Lys Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr
420 425 430

Leu Leu Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys
435 440 445

Ser Lys Ser Gln Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala
450 455 460

Leu Ser Val Ala Ala Gly Leu Gly Leu Cys Ser Leu Ile Gly Ile Ser
465 470 475 480

Phe Asn Ala Ala Thr Thr Gln Val Leu Pro Phe Leu Ala Leu Gly Val
485 490 495

Gly Val Asp Asp Val Phe Leu Leu Ala His Ala Phe Ser Glu Thr Gly
500 505 510

Gln Asn Lys Arg Ile Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys
515 520 525

Arg Thr Gly Ala Ser Val Ala Leu Thr Ser Ile Ser Asn Val Thr Ala
530 535 540

Phe Phe Met Ala Ala Leu Ile Pro Ile Pro Ala Leu Arg Ala Phe Ser
545 550 555 560

Leu Gln Ala Ala Val Val Val Val Phe Asn Phe Ala Met Val Leu Leu
565 570 575

Ile Phe Pro Ala Ile Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg
580 585 590

Arg Leu Asp Ile Phe Cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val
595 600 605

Ile Gln Val Glu Pro Gln Ala Tyr Thr Glu Pro His Ser Asn Thr Arg
610 615 620

Tyr Ser Pro Pro Pro Pro Tyr Thr Ser His Ser Phe Ala His Glu Thr
625 630 635 640

His Ile Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro
645 650 655

His Thr His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser
660 665 670

Val Gln Pro Val Thr Val Thr Gln Asp Asn Leu Ser Cys Gln Ser Pro
675 680 685

Glu Ser Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser
690 695 700

Ser Leu His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser
705 710 715 720

Phe Ala Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys
725 730 735

Val Val Val Ile Leu Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr
740 745 750

Gly Thr Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro
755 760 765

Arg Glu Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe
770 775 780

Ser Phe Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn
785 790 795 800

Ile Gln His Leu Leu Tyr Asp Leu His Lys Ser Phe Ser Asn Val Lys
805 810 815

Tyr Val Met Leu Glu Glu Asn Lys Gln Leu Pro Gln Met Trp Leu His
820 825 830

Tyr Phe Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp
835 840 845

Trp Glu Thr Gly Arg Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp
850 855 860

Asp Gly Val Leu Ala Tyr Lys Leu Leu Val Gln Thr Gly Ser Arg Asp
865 870 875 880

Lys Pro Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala
885 890 895

Asp Gly Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp
900 905 910

Val Ser Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg
915 920 925

Pro His Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu
930 935

Thr Arg Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe
950 955 960

Pro Phe Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala
965 970 975

Ile Glu Lys Val Arg Val Ile Cys Asn Asn Tyr Thr Ser Leu Gly Leu
980 985 990

Ser Ser Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile
995 1000 1005

Ser Leu Arg His Trp Leu Leu Leu Ser Ile Ser Val Val Leu Ala Cys
1010 1015 1020

Thr Phe Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly
1025 1030 1035 1040

Ile Ile Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met
1045 1050 1055

Met Gly Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu
1060 1065 1070

Ile Ala Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu
1075 1080 1085

Ala Phe Leu Thr Ala Ile Gly Asp Lys Asn His Arg Ala Met Leu Ala
1090 1095 1100

Leu Glu His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu
1105 1110 1115 1120

Leu Gly Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg
1125 1130 1135

Tyr Phe Phe Ala Val Leu Ala Ile Leu Thr Val Leu Gly Val Leu Asn
1140 1145 1150

Gly Leu Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Cys Pro
1155 1160 1165

Glu Val Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro
1170 1175 1180

Glu Pro Pro Pro Ser Val Val Arg Phe Ala Val Pro Pro Gly His Thr
1185 1190 1195 1200

Asn Asn Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr
1205 1210 1215

Val Ser Gly Ile Ser Glu Glu Leu Arg Gln Tyr Glu Ala Gln Gln Gly

Ala Gly Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro
1235 1240 1245

Val Phe Ala Arg Ser Thr Val Val His Pro Asp Ser Arg His Gln Pro
1250 1255 1260

Pro Leu Thr Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Ser
1265 1270 1275 1280

Pro Gly Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly
1285 1290 1295

Leu Arg Pro Pro Pro Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser
1300 1305 1310

Thr Glu Gly His Ser Gly Pro Ser Asn Arg Asp Arg Ser Gly Pro Arg
1315 1320 1325

Gly Ala Arg Ser His Asn Pro Arg Asn Pro Thr Ser Thr Ala Met Gly
1330 1335 1340

Ser Ser Val Pro Ser Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser
1345 1350 1355 1360

Ala Ser Val Thr Val Ala Val His Pro Pro Pro Gly Pro Gly Arg Asn
1365 1370 1375

Pro Arg Gly Gly Pro Cys Pro Gly Tyr Glu Ser Tyr Pro Glu Thr Asp
1380 1385 1390

His Gly Val Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu
1395 1400 1405

Arg Arg Asp Ser Lys Val Glu Val Ile Glu Leu Gln Asp Val Glu Cys
1410 1415 1420

Glu Glu Arg Pro Trp Gly Ser Ser Ser Asn
1425 1430

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Leu Ile Val Gly Gly



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Pro Phe Phe Trp Glu Gln Tyr
1 5

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

GGACGAATTC AARG^NCAYC ARYTNTGG 
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GGACGAATTC CYTCCCARAA RCANTC 
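
The primer of SEQ ID NO: 15 is written with degenerate IUPAC nucleotide codes (Y, R, N). As an illustrative aid only, and not part of the listing as filed, the following minimal Python sketch shows how the reverse complement of such a degenerate primer can be computed. The function name `reverse_complement` and the table `IUPAC_COMPLEMENT` are hypothetical names chosen for this sketch; the complement pairs are the standard IUPAC ones.

```python
# Complement of each IUPAC nucleotide code, degenerate codes included.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


# SEQ ID NO: 15 as printed above, with the spacing between groups removed.
primer_15 = "GGACGAATTCCYTCCCARAARCANTC"
print(reverse_complement(primer_15))
```

Read in triplets, the first fifteen bases of the result (GAN TGY TTY TGG GAR) are degenerate codons consistent with the residues Asp Cys Phe Trp Glu found at the end of the peptide of SEQ ID NO: 11, which suggests the primer anneals to the opposite strand of that region.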
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GGACGAATTC YTNGANTGYT TYTGGGA 
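
Because the primer of SEQ ID NO: 16 contains degenerate positions, it denotes a pool of distinct oligonucleotides rather than a single molecule. The following minimal Python sketch, offered as an illustration under stated assumptions and not as part of the listing, counts the members of that pool by multiplying the number of bases each IUPAC code stands for. The names `IUPAC_CHOICES` and `degeneracy` are hypothetical.

```python
from math import prod

# Number of concrete bases represented by each IUPAC nucleotide code.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    return prod(IUPAC_CHOICES[base] for base in primer.upper())


# SEQ ID NO: 16 as printed above, with the spacing between groups removed.
primer_16 = "GGACGAATTCYTNGANTGYTTYTGGGA"
print(len(primer_16), degeneracy(primer_16))
```

For this primer the length is 27 bases, matching the stated LENGTH, and the degeneracy is 2 x 4 x 4 x 2 x 2 = 128 (three Y positions and two N positions).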

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Ch-:;:CC?,Qcc aagcttgtcn ggccartgca t 

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5288 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GAATTCCGGG GACCGCAAGG AGTGCCGCGG AAGCGCCCGA AGGACAGGCT CGCTCGGCGC 60 

GCCGGCTCTC GCTCTTCCGC GAACTGGATG TGGGCAGCGG CGGCCGCAGA GACCTCGGGA 120 

CCCCCGCGCA ATGTGGCAAT GGAAGGCGCA GGGTCTGACT CCCCGGCAGC GGCCGCGGCC 180 

GCAGCGGCAG CAGCGCCCGC CGTGTGAGCA GCAGCAGCGG CTGGTCTGTC AACCGGAGCC 240 

CGAGCCCGAG CAGCCTGCGG CCAGCAGCGT CCTCGCAAGC CGAGCGCCCA GGCGCGCCAG 300 

GAGCCCGCAG CAGCGGCAGC AGCGCGCCGG GCCGCCCGGG AAGCCTCCGT CCCCGCGGCG 360 

GCGGCGGCGG CGGCGGCGGC AACATGGCCT CGGCTGGTAA CGCCGCCGAG CCCCAGGACC 420

GCGGCGGCGG CGGCAGCGGC TGTATCGGTG CCCCGGGACG GCCGGCTGGA GGCGGGAGGC 480

GCAGACGGAC GGGGGGGCTG CGCCGTGCTG CCGCGCCGGA CCGGGACTAT CTGCACCGGC 540

CCAGCTACTG CGACGCCGCC TTCGCTCTGG AGCAGATTTC CAAGGGGAAG GCTACTGGCC 600

GGAAAGCGCC ACTGTGGCTG AGAGCGAAGT TTCAGAGACT CTTATTTAAA CTGGGTTGTT 660

A^A.-rCAAAA AAACTGCGGC AAGTTCTTGG TTGTGGGCCT CCTCATATTT GGGGCCTTCG 720

CGGTGGGATT AAAAGCAGCG AACCTCGAGA CCAACGTGGA GGAGCTGTGG GTGGAAGTTG 780

GAGGACGAGT AAGTCGTGAA TTAAATTATA CTCGCCAGAA GATTGGAGAA GAGGCTATGT 840

TTAATCCTCA ACTCATGATA CAGACCCCTA AAGAAGAAGG TGCTAATGTC CTGACCACAG 900

AAGCGCTCCT ACAACACCTG GACTCGGCAC TCCAGGCCAG CCGTGTCCAT GTATACATGT 960

ACAACAGGCA GTGGAAATTG GAACATTTGT GTTACAAATC AGGAGAGCTT ATCACAGAAA 1020

CAGGTTACAT GGATCAGATA ATAGAATATC TTTACCCTTG TTTGATTATT ACACCTTTGG 1080

ACTGCTTCTG GGAAGGGGCG AAATTACAGT CTGGGACAGC ATACCTCCTA GGTAAACCTC 1140
CTTTGCGGTG GACAAACTTC GACCCTTTGG AATTCCTGGA AGAGTTAAAG AAAATAAACT 1200
ATCAAGTGGA CAGCTGGGAG GAAATGCTGA ATAAGGCTGA GGTTGGTCAT GGTTACATGG 1260
ACCGCCCCTG CCTCAATCCG GCCGATCCAG ACTGCCCCGC CACAGCCCCC AACAAAAATT 1320
CAACCAAACC TCTTGATATG GCCCTTGTTT TGAATGGTGG ATGTCATGGC TTATCCAGAA 1380
AGTATATGCA CTGGCAGGAG GAGTTGATTG TGGGTGGCAC AGTCAAGAAC AGCACTGGAA 1440
AACTCGTCAG CGCCCATGCC CTGCAGACCA TGTTCCAGTT AATGACTCCC AAGCAAATGT 1500
ACGAGCACTT CAAGGGGTAC GAGTATGTCT CACACATCAA CTGGAACGAG GACAAAGCGG 1560
CAGCCATCCT GGAGGCCTGG CAGAGGACAT ATGTGGAGGT GGTTCATCAG AGTGTCGCAC 1620
AGAACTCCAC TCAAAAGGTG CTTTCCTTCA CCACCACGAC CCTGGACGAC ATCCTGAAAT 1680
CCTTCTCTGA CGTCAGTGTC ATCCGCGTGG CCAGCGGCTA CTTACTCATG CTCGCCTATG 1740
CCTGTCTAAC CATGCTGCGC TGGGACTGCT CCAAGTCCCA GGGTGCCGTG GGGCTGGCTG 1800
^J^TwCTGCT GGTTGCACTG TCAGTGGCTG CAGGACTGGG CCTG'i j::T::a IIGATCGGA/. 1860
TTTCCTTTAA CGCTGCAACA ACTCAGGTTT TGCCATTTCT CGCTCTTGGT GTTGGTGTGG 1920
ATGATGTTTT TCTTCTGGCC CACGCCTTCA GTGAAACAGG ACAGAA7AAJ^ AGAATCCCTT 1980
TTGAGGACAG GACCGGGGAG TGCCTGAAGC GCACAGGAGC CAGCGTGGCC CTCACGTCCA 2040
TCAGCAATGT CACAGCCTTC TTCATGGCCG CGTTAATCCC AATTCCCGCT CTGCGGGCGT 2100
TCTCCCTCCA GGCAGCGGTA GTAGTGGTGT TCAATTTTGC CATGGTTCTG CTCATTTTTC 2160
"Tw-JAATTCT CAGCATGGAT TTATATCGAC GCGAGGACAG GAGAC:Tr7-"A7 ATTTTCTGCT 2220
GTTTTACAAG CCCCTGCGTC AGCAGAGTGA TTCAGGTTGA ACCTCAGGGC TACACCGACA 2280
CACACGACAA TACCCGCTAC AGCCCCCCAC CTCCCTACAG CAGCCACAGC TTTGCCCATG 2340
AAACGCAGAT TACCATGCAG TCCACTGTCC AGCTCCGCAC GGAGTACGAC CCCCACACGC 2400
ACGTGTACTA CACCACCGCT GAGCCGCGCT CCGAGATCTC TGTGGAGCCC GTCACCGTGA 2460
CACAGGACAC CCTCAGCTGC CAGAGCCCAG AGAGCACCAG CTCCACAAGG GACCTGCTCT 2520
CCCAGTTCTC CGACTCCAGC CTCCACTGCC TCGAGCCCCC CTGTACGAAG TGGACACTCT 2580
CATCTTTTGC TGAGAAGCAC TATGCTCCTT TCCTCTTGAA ACCAAAAGCC AAGGTAGTGG 2640
TGATCTTCCT TTTTCTGGGC TTGCTGGGGG TCAGCCTTTA TGGCACCACC CGAGTGAGAG 2700
ACGGGCTGGA CCTTACGGAC ATTGTACCTC GGGAAACCAG AGAATATGAC TTTATTGCTG 2760
CACAATTCAA ATACTTTTCT TTCTACAACA TGTATATAGT CACCCAGAAA GCAGACTACC 2820
CGAATATCCA GCACTTACTT TACGACCTAC ACAGGAGTTT CAGTAACGTG AAGTATGTCA 2880
TGTTGGAAGA AAACAAACAG CTTCCCAAAA TGTGGCTGCA CTACTTCAGA GACTGGCTTC 2940
h::r. ":;-.?rTCA GGATGCATTT GACAGTGACT GGGAAACCGG GAAAATCATG CCAAACAATT 3000
hQPJKQhAlQQ K'lQKQkZQh'V GGAGTCCTTG CCTACAAACT CCTGGTGCAA ACCGGCAGCC 3060
GCGATAAGCC CATCGACATC AGCCAGTTGA CTAAACAGCG TCTGGTGGAT GCAGATGGCA 3120
TCATTAATCC CAGCGCTTTC TACATCTACC TGACGGCTTG GGTCAGCAAC GACCCCGTCG 3180
CGTATGCTGC CTCCCAGGCC AACATCCGGC CACACCGACC AGAATGGGTC CACGACAAAG 3240
CCGACTACAT GCCTGAAACA AGGCTGAGAA TCCCGGCAGC AGAGCCCATC GAGTATGCCC 3300
AGTTCCCTTT CTACCTCAAC GGGTTGCGGG ACACCTCAGA CTTTGTGGAG GCAATTGAAA 3360
AAGTAAGGAC CATCTGCAGC AACTATACGA GCCTGGGGCT GTCCAGTTAC CCCAACGGCT 3420
ACC-:CTTCCT CTTCTGGGAG CAGTACATCG GCCTCCGCCA CTGGCTGCTG CTGTTCATCA 3480
GCGTGGTGTT GGCCTGCACA TTCCTCGTGT GCGCTGTCTT CCTTCTGAAC CCCTGGACGG 3540
CCGGGATCAT TGTGATGGTC CTGGCGCTGA TGACGGTCGA GCTGTTCGGC ATGATGGGCC 3600
TCATCGGAAT CAAGCTCAGT GCCGTGCCCG TGGTCATCCT GATCGCTTCT GTTGGCATAG 3660
GAGTGGAGTT CACCGTTCAC GTTGCTTTGG CCTTTCTGAC GGCCATCGGC GACAAGAACC 3720
GCAGGGCTGT GCTTGCCCTG GAGCACATGT TTGCACCCGT CCTGGATGGC GCCGTGTCCA 3780
CTCTGCTGGG AGTGCTGATG CTGGCGGGAT CTGAGTTCGA CTTCATTGTC AGGTATTTCT 3840
TTG'JTGTGCT GGCGATCCTC ACCATCCTCG GCGTTCTCAA TGGGCTGGTT TTGCTTCCCG 3900
TGCTTTTGTC TTTCTTTGGA CCATATCCTG AGGTGXCTCC AGCCAACGGC TTGAACCGCC 3960
TGCCCACACC CTCCCCTGAG CCACCCCCCA GCGTGGTCCG CTTCGCCATG CCGCCCGGCC 4020
ACACGCACAG CGGGTCTGAT TCCTCCGACT CGGAGTATAG TTCCCAGACG ACAGTGTCAG 4080
GCCTCAGCGA GGAGCTTCGG CACTACGAGG CCCAGCAGGG CGCGGGAGGC CCTGCCCACC 4140
AAGTGATCGT GGAAGCCACA GAAAACCCCG TCTTCGCCCA CTCCACTGTG GTCCATCCCG 4200
AATCCAGGCA TCACCCACCC TCGAACCCGA GACAGCAGCC CCACCTGGAC TCAGGGTCCC 4260
TGCCTCCCGG ACGGCAAGGC CAGCAGCCCC GCAGGGACCC CCCCAGAGAA GGCTTGTGGC 4320
CACCCCTCTA CAGACCGCGC AGAGACGCTT TTGAAATTTC TACTGAAGGG CATTCTGGCC 4380
CTAGCAATAG GGCCCGCTGG GGCCCTCGCG GGGCCCGTTC TCACAACCCT CGGAACCCAG 4440
CGTCCACTGC CATGGGCAGC TCCGTGCCCG GCTACTGCCA GCCCATCACC ACTGTGACGG 4500
CTTCTGCCTC CGTGACTGTC GCCGTGCACC CGCCGCCTGT CCCTGGGCCT GGGCGGAACC 4560
CCCGAGGGGG ACTCTGCCCA GGCTACCCTG AGACTGACCA CGGCCTGTTT GAGGACCCCC 4620
ACGTGCCTTT CCACGTCCGG TGTGAGAGGA GGGATTCGAA GGTGGAAGTC ATTGAGCTGC 4680
AGGACGTGGA ATGCGAGGAG AGGCCCCGGG GAAGCAGCTC CAACTGAGGG TGATTAAAAT 4740
CTGAAGCAAA GAGGCCAAAG ATTGGAAACC CCCCACCCCC ACCTCTTTCC AGAACTGCTT 4800
GAAGAGAACT GGTTGGAGTT ATGGAAAAGA TGCCCTGTGC CAGGACAGCA GTTCATTGTX 4860
ACTGTAACCG ATTGTATTAT TTTGTTAAAT ATTTCTATAA ATATTTAAGA GATGTACACA 4920
TGTGTAATAT AGGAAGGAAG GATGTAAAGT GGTATGATCT GGGGCTTCTC CACTCCTGCC 4980

CCAGAGTGTG GAGGCCACAG TGGGGCCTCT CCGTATTTGT GCATTGGGCT CCGTGCCACA 5040
ACCAAGCTTC ATTAGTCTTA AATTTCAGCA TATGTTGCTG CTGCTTAAAT ATTGTATAAT 5100
TTACTTGTAT AATTCTATGC AAATATTGCT TATGTAATAG GATTATTTTG TAAAGGTTTC 5160
TGTTTAAAAT ATTTTAAATT TGCATATCAC AACCCTGTGG TAGTATGAAA TGTTACTGTT 5220
AACTTTCAAA CACGCTATGC GTGATAATTT TTTTGTTTAA TGAGCAGATA TGAAGAAAGC 5280
CCGGAATT 5288

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1447 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Ala Ser Ala Gly Asn Ala Ala Glu Pro Gln Asp Arg Gly Gly Gly
1 5 10 15

Gly Ser Gly Cys Ile Gly Ala Pro Gly Arg Pro Ala Gly Gly Gly Arg
20 25 30

Arg Arg Arg Thr Gly Gly Leu Arg Arg Ala Ala Ala Pro Asp Arg Asp
35 40 45

Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu Glu Gln
50 55 60

Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp Leu Arg
65 70 75 80

Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile Gln Lys
85 90 95

Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly Ala Phe
100 105 110

Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu Glu Leu
115 120 125

Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr Thr Arg
130 135 140

Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met Ile Gln
145 150 155

Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala Leu Leu
165 170 175

Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val Tyr Met
180 185 190

Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser Gly Glu
195 200 205

Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr Leu Tyr
210 215 220

Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly Ala Lys
225 230 235 240

Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu Arg Trp
245 250 255

Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys Ile Asn
260 265 270

Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu Val Gly
275 280 285

His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro Asp Cys
290 295 300

Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp Met Ala
305 310 315 320

Leu Val Leu Asn Gly Gly Cys His Gly Leu Ser Arg Lys Tyr Met His
325 330 335

Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ser Thr Gly
340 345 350

Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu Met Thr
355 360 365

Pro Lys Gln Met Tyr Glu His Phe Lys Gly Tyr Glu Tyr Val Ser His
370 375 380

Ile Asn Trp Asn Glu Asp Lys Ala Ala Ala Ile Leu Glu Ala Trp Gln
390 395

Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Gln Asn Ser Thr
405 410 415

Gln Lys Val Leu Ser Phe Thr Thr Thr Thr Leu Asp Asp Ile Leu Lys
420 425 430

Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr Leu Leu
435 440 445

Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys Ser Lys
450 455 460

Ser Gln Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala Leu Ser
465 470 475 480

Val Ala Ala Gly Leu Gly Leu Cys Ser Leu He Gly He *Ser Phe Asn 
485 490 495 

Ala Ala Thr Thr Gin Val Leu Pro Phe Leu Ala Leu Gly Val Gly Val 
500 505 510 

Asp Asp Val Phe Leu Leu Ala His Ala Phe Ser Glu Thr Gly Gin Asn 
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515 520 525 

Lys Arg lie Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys Arg Thr 
530 535 540 

Gly Ala Ser Val Ala Leu Thr Ser He Ser Asn Val Thr Ala Phe Phe 
545 550 555 560 

Met Ala Ala Leu He Pro He Pro Ala Leu Arg Ala Phe Ser Leu Gin 
565 5-70 575 

Ala Ala Val Val Val Val Phe Asn Phe Ala Met Val Leu Leu He Phe 
580 585 590 

Pro Ala He Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg Arg Leu 
595 600 605 

Asp He Phe Cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val He Gin 
610 615 620 

Val Glu Pro Gln Ala Tyr Thr Asp Thr His Asp Asn Thr Arg Tyr Ser
625 630 635 640

Pro Pro Pro Pro Tyr Ser Ser His Ser Phe Ala His Glu Thr Gln Ile
645 650 655

Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro His Thr
660 665 670 

His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser Val Gln
675 680 685

Pro Val Thr Val Thr Gln Asp Thr Leu Ser Cys Gln Ser Pro Glu Ser
690 695 700

Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser Ser Leu
705 710 715 720 

His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser Phe Ala 
725 730 735 

Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys Val Val 
740 745 750 

Val Ile Phe Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr Gly Thr
755 760 765

Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro Arg Glu
770 775 780

Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe Ser Phe
785 790 795 800

Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn Ile Gln
805 810 815 

His Leu Leu Tyr Asp Leu His Arg Ser Phe Ser Asn Val Lys Tyr Val 
820 825 830 

Met Leu Glu Glu Asn Lys Gln Leu Pro Lys Met Trp Leu His Tyr Phe
835 840 845

Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp Trp Glu
850 855 860

Thr Gly Lys Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp Asp Gly
865 870 875 880 

Val Leu Ala Tyr Lys Leu Leu Val Gln Thr Gly Ser Arg Asp Lys Pro
885 890 895

Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala Asp Gly
900 905 910

Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp Val Ser
915 920 925

Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg Pro His
930 935 940 

Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu Thr Arg 
945 950 955 960 

Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe Pro Phe
965 970 975

Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala Ile Glu
980 985 990

Lys Val Arg Thr Ile Cys Ser Asn Tyr Thr Ser Leu Gly Leu Ser Ser
995 1000 1005

Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile Gly Leu
1010 1015 1020 

Arg His Trp Leu Leu Leu Phe Ile Ser Val Val Leu Ala Cys Thr Phe
1025 1030 1035 1040

Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly Ile Ile
1045 1050 1055

Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met Met Gly
1060 1065 1070

Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu Ile Ala
1075 1080 1085

Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu Ala Phe
1090 1095 1100

Leu Thr Ala Ile Gly Asp Lys Asn Arg Arg Ala Val Leu Ala Leu Glu
1105 1110 1115 1120 

His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu Leu Gly 
1125 1130 1135 

Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg Tyr Phe
1140 1145 1150

Phe Ala Val Leu Ala Ile Leu Thr Ile Leu Gly Val Leu Asn Gly Leu
1155 1160 1165

Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Tyr Pro Glu Val
1170 1175 1180

Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro Glu Pro
1185 1190 1195 1200

Pro Pro Ser Val Val Arg Phe Ala Met Pro Pro Gly His Thr His Ser
1205 1210 1215 

Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr Val Ser
1220 1225 1230

Gly Leu Ser Glu Glu Leu Arg His Tyr Glu Ala Gln Gln Gly Ala Gly
1235 1240 1245

Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro Val Phe
1250 1255 1260 

Ala His Ser Thr Val Val His Pro Glu Ser Arg His His Pro Pro Ser 
1265 1270 1275 1280 

Asn Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Pro Pro Gly
1285 1290 1295

Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly Leu Trp
1300 1305 1310

Pro Pro Leu Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser Thr Glu
1315 1320 1325

Gly His Ser Gly Pro Ser Asn Arg Ala Arg Trp Gly Pro Arg Gly Ala
1330 1335 1340

Arg Ser His Asn Pro Arg Asn Pro Ala Ser Thr Ala Met Gly Ser Ser 
1345 1350 1355 1360 

Val Pro Gly Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser Ala Ser
1365 1370 1375

Val Thr Val Ala Val His Pro Pro Pro Val Pro Gly Pro Gly Arg Asn
1380 1385 1390

Pro Arg Gly Gly Leu Cys Pro Gly Tyr Pro Glu Thr Asp His Gly Leu
1395 1400 1405 

Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu Arg Arg Asp 
1410 1415 1420 

Ser Lys Val Glu Val Ile Glu Leu Gln Asp Val Glu Cys Glu Glu Arg
1425 1430 1435 1440



Pro Arg Gly Ser Ser Ser Asn 
1445

WHAT IS CLAIMED IS:

1. An isolated nucleic acid encoding a patched protein other than Drosophila melanogaster
patched protein, or fragment of at least about 12 nt in length thereof as other than an
intact chromosome.

2. An isolated nucleic acid according to Claim 1, wherein said patched protein is mosquito,
butterfly or beetle.

3. An isolated nucleic acid according to Claim 1, wherein said patched protein is a
mammalian protein.

4. An isolated nucleic acid according to Claim 3, wherein said patched protein is human.

5. An isolated nucleic acid according to Claim 3, wherein said patched protein is mouse.

6. An expression cassette comprising a transcriptional initiation region functional in an
expression host, a nucleic acid having a sequence of the isolated nucleic acid according
to Claim 1 under the transcriptional regulation of said transcriptional initiation region, and
a transcriptional termination region functional in said expression host.

7. A cell comprising an expression cassette according to Claim 6 as part of an
extrachromosomal element or integrated into the genome of a host cell as a result of
introduction of said expression cassette into said host cell and the cellular progeny of said
host cell.

8. A method for producing patched protein, said method comprising growing a cell
according to Claim 7, whereby said patched protein is expressed; and isolating said
patched protein free of other proteins.

9. A purified polypeptide composition comprising at least 50 weight % of the protein
present as a patched protein or a fragment thereof, other than Drosophila melanogaster
patched protein.

10. A purified polypeptide composition according to Claim 9, wherein said patched protein
is a mammalian protein.

11. A purified polypeptide composition according to Claim 10, wherein said patched protein
is human.

12. A purified polypeptide composition according to Claim 10, wherein said patched protein
is mouse.

13. A monoclonal antibody binding specifically to a patched protein other than Drosophila
melanogaster patched protein.

14. A method for diagnosing a genetic predisposition for at least one of developmental
abnormalities and cancer in an individual, the method comprising:

detecting the presence of a predisposing mutation in a patched gene in the
germline of said individual,

wherein the presence of said predisposing mutation indicates that said individual
has a genetic predisposition for at least one of developmental abnormalities and
cancer.

15. A method according to Claim 14, wherein said genetic predisposition is basal cell nevus 
syndrome. 

16. A method according to Claim 14, wherein said detecting step comprises analyzing the
DNA of said individual.

17. A method according to Claim 14, wherein said detecting step comprises functional 
analysis of patched protein function. 

18. A method according to Claim 14, wherein said detecting step comprises detecting
antibody binding to abnormal patched protein.

19. A method for characterizing the phenotype of a tumor, the method comprising:

detecting the presence of an oncogenic patched mutation in said tumor, wherein
the presence of said oncogenic mutation indicates that said tumor has a
patched-associated phenotype.

20. A method according to Claim 19, wherein said tumor is a carcinoma. 

21. A method according to Claim 20, wherein said carcinoma is a basal cell carcinoma.

22. A method according to Claim 19, wherein said detecting step comprises analyzing the 
DNA of said tumor. 

23. A method according to Claim 19, wherein said detecting step comprises functional
analysis of patched protein function.

24. A method according to Claim 19, wherein said detecting step comprises detecting
antibody binding to abnormal patched protein.

25. A genetically engineered mammalian cell predisposed to develop basal cell carcinoma as
a result of transfection of said mammalian cell with at least one DNA construct
comprising an altered patched or hedgehog gene.